POWER OF CHANGE-POINT TESTS FOR LONG-RANGE DEPENDENT DATA



HEROLD DEHLING, AENEAS ROOCH, AND MURAD S. TAQQU 



Abstract. We investigate the power of the CUSUM test and the Wilcoxon change-point
test for a shift in the mean of a process with long-range dependent noise. We derive analytic
formulas for the power of these tests under local alternatives. These results enable us to
calculate the asymptotic relative efficiency (ARE) of the CUSUM test and the Wilcoxon
change-point test. We obtain the surprising result that for Gaussian data, the ARE of these
two tests equals 1, in contrast to the case of i.i.d. noise when the ARE is known to be $3/\pi$.



Contents 

1. Introduction 

2. Power of the CUSUM Test under Local Alternatives 

3. Power of the Wilcoxon Change-Point Test under Local Alternatives 

4. ARE of the Wilcoxon Change-Point Test and the CUSUM Test for LRD Data 

5. ARE of the Wilcoxon Change-Point Test and the CUSUM Test for IID Data 

6. Simulation Results 

6.1. Gaussian data 

6.2. Heavy tailed data 
References 






1. Introduction 

Statistical tests for the presence of changes in the structure of time series are of great 
importance in a wide range of scientific discussions, e.g. regarding economic, technological 
and climate data. Many procedures for detecting changes and for estimating change-points 
have been proposed in the literature; see e.g. Csörgő and Horváth (1997) for a detailed
exposition. In the case of independent data, the theory is quite satisfactory. For various 
types of change-point models, statistical procedures have been proposed and their properties 
investigated. In contrast, the situation is different for dependent data, such as encountered 
in time series models. For dependent data, most research has focused on linear procedures, 
such as cumulative sum (CUSUM) tests, and there are many open problems when it comes 
to other types of test procedures, e.g. those used in robust statistics. 

In the present paper, we study the change-point problem for long-range dependent data. 
Specifically, we will test the hypothesis that the process is stationary against the alternative 
that there is a change in the mean. The classical test statistic for this problem is the CUSUM 
statistic,

(1) $\max_{1 \le k \le n-1} \left| \sum_{i=1}^{k} X_i - \frac{k}{n} \sum_{i=1}^{n} X_i \right|.$

When the test statistic is large, one infers that there is a change in the mean. The CUSUM 
test has good properties when the underlying process is Gaussian. The asymptotic distribu- 
tion of the CUSUM test in the presence of long-range dependent data has been investigated 
by Horváth and Kokoszka (1997).

However, the CUSUM test is not robust against possible outliers in the data, because the
sum $\sum_{i=1}^{k} X_i$ can change drastically when there are outliers. Recently, Dehling, Rooch and
Taqqu (2013) have proposed a robust alternative to the CUSUM test, which is based on the
Wilcoxon two-sample rank statistic. The corresponding "Wilcoxon change-point test" uses
the test statistic

(2) $\max_{1 \le k \le n-1} \left| \sum_{i=1}^{k} \sum_{j=k+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right) \right|.$

One rejects the null hypothesis when this test statistic is large. Rank tests for change-point
problems have been studied earlier by Antoch et al. (2008), in the presence of i.i.d. data, and
by Wang (2008) for linear processes.

In their paper, Dehling, Rooch and Taqqu (2013) investigated the asymptotic distribution 
of the Wilcoxon change-point test under the null hypothesis of no change, in the presence 
of long-range dependence. Moreover, they performed a simulation study to compare the 
finite sample performance and the power of the CUSUM test based on (1) and the Wilcoxon
change-point test based on (2).

In the present paper, we study the power of the CUSUM test and the Wilcoxon change- 
point test for a shift in the mean of a long-range dependent process. We will calculate the 
power under local alternatives, where the height of the shift decreases with the sample size 
$n$ in such a way that the tests have non-trivial limit power as $n \to \infty$. These results enable
us to compute the asymptotic relative efficiency (ARE) of the CUSUM and the Wilcoxon 
change-point tests, which is defined as the limit of the ratio of the sample sizes required 
to obtain a given power. We obtain the surprising result that the ARE of these two tests 
equals 1 in the case of long-range dependent Gaussian data. This is in contrast with the 
case of i.i.d. and short-range dependent data, where the ARE of the Wilcoxon change-point 
test with respect to the CUSUM test is $3/\pi$. In the context of M-estimation of a location
parameter, a similar phenomenon has been observed by Beran (1991); see also Beran (1994), 
Corollary 8.1. 

Note that Dehling et al. (2013) called the CUSUM test the "difference of means test" and
the Wilcoxon change-point test the "Wilcoxon-type" test.

We consider a model where the observations are generated by a stochastic process $(X_i)_{i \ge 1}$
of the type

(3) $X_i = \mu_i + \epsilon_i,$

where $(\epsilon_i)_{i \ge 1}$ is a long-range dependent stationary process with mean zero and finite variance, and
where $(\mu_i)_{i \ge 1}$ are the unknown means. We focus on the case when $(\epsilon_i)_{i \ge 1}$ is an instantaneous
functional of a stationary Gaussian process $(\xi_i)_{i \ge 1}$ with non-summable covariances, i.e.

$\epsilon_i = G(\xi_i), \quad i \ge 1.$

We assume that $(\xi_i)_{i \ge 1}$ is a long-range dependent (LRD), mean-zero Gaussian process with
variance $E(\xi_i^2) = 1$ and autocovariance function

(4) $\rho(k) = k^{-D} L(k), \quad k \ge 1,$

where $0 < D < 1$, and where $L(k)$ is a slowly varying function. Moreover, $G : \mathbb{R} \to \mathbb{R}$ is a
measurable function satisfying $E(G(\xi_i)) = 0$.

Based on observations $X_1, \ldots, X_n$, we wish to test the hypothesis

$H : \mu_1 = \ldots = \mu_n$

that there is no change in the means of the data against the alternative

(5) $A : \mu_1 = \ldots = \mu_k \ne \mu_{k+1} = \ldots = \mu_n$, for some $k \in \{1, \ldots, n-1\}$.

We shall refer to this test problem as $(H, A)$.

Dehling, Rooch and Taqqu (2013) have studied two tests for this change-point problem,
namely the Wilcoxon change-point test which is based on the test statistic

(6) $W_n = \frac{1}{n d_n} \max_{1 \le k \le n-1} \left| \sum_{i=1}^{k} \sum_{j=k+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right) \right|$

and the CUSUM test which uses the test statistic

(7) $D_n := \frac{1}{n d_n} \max_{1 \le k \le n-1} \left| \sum_{i=1}^{k} \sum_{j=k+1}^{n} (X_j - X_i) \right|.$

Observe that the normalization $d_n$, which will be specified below, is the same for both tests.
These tests are similar in spirit. They compare the first part of the sample to the second
part. The Wilcoxon change-point test (6) involves the ranks of the data whereas the CUSUM
test (7) involves their values. One rejects the null hypothesis of no change when these test
statistics are large.
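
As an illustration of how the two statistics compare the first $k$ observations with the remaining $n-k$, the following minimal Python sketch computes the unnormalized maxima over $k$ of $|\sum_{i \le k} \sum_{j > k} (X_j - X_i)|$ and $|\sum_{i \le k} \sum_{j > k} (1_{\{X_i \le X_j\}} - 1/2)|$. The function names are hypothetical, and the factor $1/(n d_n)$ is deliberately left out because $d_n$ depends on the unknown dependence structure.

```python
def cusum_statistic(x):
    """Unnormalized CUSUM statistic: max over k of |sum_{i<=k} sum_{j>k} (x_j - x_i)|."""
    n = len(x)
    total = sum(x)
    partial = 0.0
    best = 0.0
    for k in range(1, n):              # k = 1, ..., n-1
        partial += x[k - 1]            # S_k = x_1 + ... + x_k
        # The double sum collapses to k * S_n - n * S_k.
        best = max(best, abs(k * total - n * partial))
    return best


def wilcoxon_statistic(x):
    """Unnormalized Wilcoxon statistic: max over k of
    |sum_{i<=k} sum_{j>k} (1{x_i <= x_j} - 1/2)|, updated in O(n) per step."""
    n = len(x)
    w = 0.0
    best = 0.0
    for k in range(1, n):
        new = x[k - 1]                 # observation moving into the first block
        for i in range(k - 1):         # pairs (i, new) leave the double sum
            w -= (1.0 if x[i] <= new else 0.0) - 0.5
        for j in range(k, n):          # pairs (new, j) enter the double sum
            w += (1.0 if new <= x[j] else 0.0) - 0.5
        best = max(best, abs(w))
    return best
```

For a sample with a clear upward shift, such as `[0.0, 0.0, 1.0, 1.0]`, both maxima are attained at the true change point $k = 2$.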

Dehling, Rooch and Taqqu (2013) investigated the asymptotic distribution of these test
statistics under the null hypothesis $H$ of no change in the means. In addition, they calculated
the power of these tests numerically via a Monte-Carlo simulation. In this paper, we will
compute the power of the above test statistics under a local alternative. More specifically,
we shall consider the following sequence of alternatives

(8) $A_{\tau,h_n}(n) : \quad \mu_i = \begin{cases} \mu & \text{for } i = 1, \ldots, [n\tau] \\ \mu + h_n & \text{for } i = [n\tau]+1, \ldots, n, \end{cases}$

where $0 \le \tau \le 1$. Observe that the level shift $h_n$ depends on the sample size $n$.

2. Power of the CUSUM Test under Local Alternatives

We will first investigate the asymptotic distribution of the process

(9) $D_n(\lambda) := \frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} (X_j - X_i), \quad 0 \le \lambda \le 1.$

To do so, we consider the Hermite expansion of $G(\xi_i)$, namely

$G(\xi_i) = \sum_{q=1}^{\infty} \frac{a_q}{q!} H_q(\xi_i),$

where $H_q$ is the $q$-th order Hermite polynomial. We define the Hermite rank of the function
$G$ as

$m := \min\{q \ge 1 : a_q \ne 0\}$

and introduce the normalization constants

(10) $d_n^2 = \mathrm{Var}\left( \sum_{j=1}^{n} H_m(\xi_j) \right).$

We suppose $0 < D < \frac{1}{m}$, in which case

(11) $d_n^2 \sim c_m \, n^{2-Dm} L^m(n),$

where $c_m = 2\,m! / ((1 - Dm)(2 - Dm))$. Here we use the symbol $a_n \sim b_n$ to denote $a_n / b_n \to 1$
as $n \to \infty$.
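
To make the normalization concrete, the following sketch compares the exact variance of $\sum_{j \le n} \xi_j$ (the case $m = 1$, where $H_1(x) = x$) with the asymptotic expression $c_1 n^{2-D}$. It assumes the specific autocovariance $\rho(k) = (1+k)^{-D}$, which is of the form $k^{-D} L(k)$ with $L(k) = (k/(1+k))^D \to 1$; this choice and the function names are illustrative assumptions, not part of the paper.

```python
import math


def exact_variance(n, d):
    """Var(xi_1 + ... + xi_n) for unit-variance xi with rho(k) = (1 + k)**(-d)."""
    return n + 2.0 * sum((n - k) * (1.0 + k) ** (-d) for k in range(1, n))


def asymptotic_variance(n, d, m=1):
    """c_m * n**(2 - d*m) with c_m = 2 m! / ((1 - d m)(2 - d m)), taking L = 1."""
    c_m = 2.0 * math.factorial(m) / ((1.0 - d * m) * (2.0 - d * m))
    return c_m * n ** (2.0 - d * m)


def variance_ratio(n, d):
    """Ratio of exact to asymptotic variance; it should approach 1 as n grows."""
    return exact_variance(n, d) / asymptotic_variance(n, d)
```

With `d = 0.4`, the ratio moves towards 1 as `n` increases from `1000` to `100000`, although the convergence is slow because the relative error decays only like a power of $n$.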

Under the null hypothesis $H$ of no level shift, we get that the process $(D_n(\lambda))_{0 \le \lambda \le 1}$ defined in (9)
converges in distribution towards the process

(12) $\left( \frac{a_m}{m!} \left( \lambda Z_m(1) - Z_m(\lambda) \right) \right)_{0 \le \lambda \le 1},$

where $m$ denotes the Hermite rank of $G$ and where

(13) $a_m = E(H_m(\xi) G(\xi));$

see Dehling, Rooch and Taqqu (2013), proof of Theorem 3. The process $(Z_m(\lambda))_{\lambda \ge 0}$ denotes
the $m$-th order Hermite process with Hurst parameter $H = 1 - \frac{Dm}{2} \in (\frac{1}{2}, 1)$. It is Gaussian
(namely fractional Brownian motion) when $m = 1$, but it is non-Gaussian when $m \ge 2$. For
various representations of the Hermite process $(Z_m(\lambda))_{\lambda \ge 0}$, see Pipiras and Taqqu (2010).
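
The Hermite rank can be checked numerically for a given $G$. The sketch below evaluates the coefficients $E(G(\xi) H_q(\xi))$ for standard normal $\xi$ by a midpoint rule and returns the first order $q \ge 1$ with a non-zero coefficient. The integration range, step count, tolerance and function names are assumptions chosen for illustration.

```python
import math


def hermite(q, x):
    """Probabilists' Hermite polynomial H_q(x), via H_{k+1} = x H_k - k H_{k-1}."""
    prev, cur = 1.0, x
    if q == 0:
        return prev
    for k in range(1, q):
        prev, cur = cur, x * cur - k * prev
    return cur


def hermite_coefficient(g, q, lo=-10.0, hi=10.0, steps=20000):
    """Approximate E(G(xi) H_q(xi)) for standard normal xi by the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += g(x) * hermite(q, x) * math.exp(-0.5 * x * x)
    return total * dx / math.sqrt(2.0 * math.pi)


def hermite_rank(g, max_order=6, tol=1e-8):
    """Smallest q >= 1 whose coefficient exceeds tol in absolute value, else None."""
    for q in range(1, max_order + 1):
        if abs(hermite_coefficient(g, q)) > tol:
            return q
    return None
```

For example, `hermite_rank(lambda t: t)` gives rank 1 (the Gaussian case), while `hermite_rank(lambda t: t * t - 1.0)` gives rank 2, the situation in which the limit process is non-Gaussian.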
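In view of (3), under the alternative $A$ in (5), we need to consider

(14) $D_n(\lambda) = \frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} \left( G(\xi_j) - G(\xi_i) \right) + \frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} (\mu_j - \mu_i).$

Observe that the statistic $D_n(\lambda)$ presumes that the jump occurs at time $[n\lambda] + 1$, whereas
the local alternative $A_{\tau,h_n}(n)$ involves a jump at $[n\tau] + 1$. There will therefore be an interplay
between $\lambda$ and $\tau$. In fact, under the local alternative $A_{\tau,h_n}(n)$ in (8), we get

(15) $\sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} (\mu_j - \mu_i) = \begin{cases} h_n [n\lambda] (n - [n\tau]) & \text{for } \lambda \le \tau \\ h_n [n\tau] (n - [n\lambda]) & \text{for } \lambda \ge \tau. \end{cases}$

We define the function $\phi_\tau : [0,1] \to \mathbb{R}$ by

(16) $\phi_\tau(\lambda) = \begin{cases} \lambda (1 - \tau) & \text{for } \lambda \le \tau \\ \tau (1 - \lambda) & \text{for } \lambda \ge \tau, \end{cases}$

which takes its maximum value $\tau(1 - \tau)$ at $\lambda = \tau$; see Figure 1. Note that for large $n$, we
get

(17) $\sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} (\mu_j - \mu_i) \sim h_n n^2 \phi_\tau(\lambda).$

Thus, in order for the second term in (14) to converge as $n \to \infty$, we have to choose the
level shift $h_n \sim c \, d_n / n$. When $n$ is large, this is exactly the order of the level shift that can
be detected with a nontrivial power, that is with a power which is neither 0 nor 1.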
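
The interplay between the assumed change point $[n\lambda]$ and the true change point $[n\tau]$ can be verified directly. The sketch below sums $\mu_j - \mu_i$ by brute force under a level shift of height $h$ after time $[n\tau]$ (taking $\mu = 0$) and compares the result with the closed form $h [n\lambda](n - [n\tau])$ for $\lambda \le \tau$ and $h [n\tau](n - [n\lambda])$ for $\lambda \ge \tau$, as well as with $n^2 h \phi_\tau(\lambda)$, where $\phi_\tau(\lambda) = \lambda(1-\tau)$ for $\lambda \le \tau$ and $\tau(1-\lambda)$ otherwise. Function names are hypothetical.

```python
import math


def phi(tau, lam):
    """phi_tau(lambda): lambda (1 - tau) for lambda <= tau, tau (1 - lambda) otherwise."""
    return lam * (1.0 - tau) if lam <= tau else tau * (1.0 - lam)


def mean_difference_sum(n, tau, lam, h):
    """Brute-force sum over i <= [n lam] < j of (mu_j - mu_i), with mu = 0 before the shift."""
    change = math.floor(n * tau)
    split = math.floor(n * lam)
    mu = [0.0 if i <= change else h for i in range(1, n + 1)]
    return sum(mu[j] - mu[i] for i in range(split) for j in range(split, n))


def closed_form(n, tau, lam, h):
    """Closed form of the same double sum, split according to lam <= tau or lam > tau."""
    change = math.floor(n * tau)
    split = math.floor(n * lam)
    if lam <= tau:
        return h * split * (n - change)
    return h * change * (n - split)
```

When $n\lambda$ and $n\tau$ are integers, the brute-force sum, the closed form and $n^2 h \phi_\tau(\lambda)$ coincide exactly; otherwise the last expression is only the leading-order approximation.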

Theorem 2.1. Let $(\xi_i)_{i \ge 1}$ be a stationary Gaussian process with mean zero, variance 1
and autocovariance function as in (4) with $0 < D < \frac{1}{m}$. Moreover, let $G : \mathbb{R} \to \mathbb{R}$ be a
measurable function satisfying $E(G^2(\xi)) < \infty$ and define $X_i = \mu_i + G(\xi_i)$. Then under the
local alternative $A_{\tau,h_n}(n)$ with

(18) $h_n \sim \frac{d_n}{n} c,$

for an arbitrary constant $c$, the process $(D_n(\lambda))_{0 \le \lambda \le 1}$ defined in (9) converges in distribution to the
process

(19) $\left( \frac{a_m}{m!} \left( \lambda Z_m(1) - Z_m(\lambda) \right) + c \, \phi_\tau(\lambda) \right)_{0 \le \lambda \le 1},$

where $(Z_m(\lambda))_{\lambda \ge 0}$ denotes the $m$-th order Hermite process with Hurst parameter $H = 1 - \frac{Dm}{2} \in (\frac{1}{2}, 1)$,
where $a_m$ is given by (13) and $\phi_\tau(\lambda)$ by (16).

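Proof. We use the decomposition (14). The first term on the right-hand side has the
same distribution as $D_n(\lambda)$ under the hypothesis, and thus converges in distribution to
$\frac{a_m}{m!} (\lambda Z_m(1) - Z_m(\lambda))$. Regarding the second term, we observe that by (15) and (18) we get

$\frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} (\mu_j - \mu_i) \sim \begin{cases} \frac{c}{n^2} [\lambda n] (n - [\tau n]) & \text{for } \lambda \le \tau \\ \frac{c}{n^2} (n - [\lambda n]) [\tau n] & \text{for } \lambda \ge \tau \end{cases} \;\longrightarrow\; c \, \phi_\tau(\lambda),$

uniformly in $\lambda \in [0, 1]$, as $n \to \infty$. □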

Remark 2.2. (i) Observe that for $c = 0$ we recover the limit distribution under the null
hypothesis. Thus, Theorem 2.1 is a generalization of the results obtained previously under
the null hypothesis. The limit process is a fractional bridge process. When $m = 1$, this
process is a fractional Gaussian bridge. For $m > 1$, the process is non-Gaussian.

(ii) Under the local alternative, i.e. when $c \ne 0$, the limit process is the sum of a fractional
bridge process and the deterministic function $c \, \phi_\tau$.

As an application of the continuous mapping theorem, we obtain the following corollary. 

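Corollary 2.3. Under the alternative $A_{\tau,h_n}(n)$ with $h_n \sim \frac{d_n}{n} c$, $D_n$ as defined in (7) converges
in distribution to

(20) $\sup_{0 \le \lambda \le 1} \left| \frac{a_m}{m!} \left( \lambda Z_m(1) - Z_m(\lambda) \right) + c \, \phi_\tau(\lambda) \right|.$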



Remark 2.4. (i) The limit distribution (20) depends on the constant $c$. For $c = 0$, we
obtain the limit distribution under the null hypothesis. Quantiles of this limit distribution
were calculated numerically via a Monte-Carlo simulation by Dehling, Rooch and Taqqu
(2013), Table 1. Increasing the value of $|c|$ leads to a shift of the distribution to the right.
If $c = \infty$, that is, if $h_n$ tends slower to zero than $c \, d_n / n$ for any $c > 0$, then the correct
normalization for $D_n(\lambda)$ should go to $\infty$ at a higher rate, which would kill the random part
$(\lambda Z_m(1) - Z_m(\lambda))$ in (19), and hence the level shift could be detected precisely. The power
of the asymptotic test would be equal to 1 in this case.

(ii) For a given $\tau \in [0, 1]$, the function $\phi_\tau(\lambda)$ takes its maximum value at $\lambda = \tau$, and this
maximum value equals $\tau(1 - \tau)$. Thus, for values of $\tau$ close to 0 and close to 1, $\tau(1 - \tau)$ is
close to 0, and thus the effect of adding the term $c \, \phi_\tau(\lambda)$ is rather small. As a result, the
power of the test is small at level shifts that occur very early or very late in the process.

(iii) The higher the level shift and the closer $\lambda$ is to $\tau$, the easier it is to detect the level shift.

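(iv) If the observations are short-range dependent, one can typically detect level shifts $h_n$
of size $\sqrt{n}/n = 1/\sqrt{n}$, but here, because of long-range dependence, the level shifts that can be
detected are of the order $d_n / n \sim c_m^{1/2} n^{-Dm/2} L^{m/2}(n)$, which tends to zero more slowly than
$1/\sqrt{n}$; note that $Dm < 1$.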

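We will now apply Corollary 2.3 in order to make power calculations for the change-point
test that rejects for large values of $D_n$. Under the null hypothesis of no level shift,

$D_n \xrightarrow{\mathcal{D}} \sup_{0 \le \lambda \le 1} \frac{|a_m|}{m!} \left| \lambda Z_m(1) - Z_m(\lambda) \right|.$

If we denote by $q_\alpha$ the upper $\alpha$-quantile of the distribution of $\sup_{0 \le \lambda \le 1} |\lambda Z_m(1) - Z_m(\lambda)|$,
we obtain

$\lim_{n \to \infty} P_H \left( D_n \ge \frac{|a_m|}{m!} q_\alpha \right) = P \left( \sup_{0 \le \lambda \le 1} \frac{|a_m|}{m!} \left| \lambda Z_m(1) - Z_m(\lambda) \right| \ge \frac{|a_m|}{m!} q_\alpha \right) = \alpha,$

where $P_H$ indicates the probability under the null hypothesis $H$. Thus, the test that rejects
the null hypothesis $H$ when $D_n \ge \frac{|a_m|}{m!} q_\alpha$ has asymptotic level $\alpha$. If $h_n$ is chosen as in (18),
we obtain under the alternative $A_{\tau,h_n}(n)$

(21) $\lim_{n \to \infty} P_{A_{\tau,h_n}(n)} \left( D_n \ge \frac{|a_m|}{m!} q_\alpha \right) = P \left( \sup_{0 \le \lambda \le 1} \left| \frac{a_m}{m!} \left( \lambda Z_m(1) - Z_m(\lambda) \right) + c \, \phi_\tau(\lambda) \right| \ge \frac{|a_m|}{m!} q_\alpha \right).$

Thus, for large $n$, the power of our test at the alternative $A_{\tau,h_n}(n)$ is approximately given
by the right-hand side of (21).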

We may also apply Corollary 2.3 in order to determine the size of a level shift at time $[\tau n]$
that can be detected with a given probability $\beta$. First, we calculate $c = c(\alpha, \beta)$ such that

(22) $P \left( \sup_{0 \le \lambda \le 1} \left| \frac{a_m}{m!} \left( \lambda Z_m(1) - Z_m(\lambda) \right) + c \, \phi_\tau(\lambda) \right| \ge \frac{|a_m|}{m!} q_\alpha \right) = \beta.$

Thus, by (21), we get that the asymptotic power of the test at the alternative $A_{\tau,h_n}(n)$ is
equal to $\beta$. Thus, given a sample size $n$, we can detect a level shift of size $h_n = \frac{d_n}{n} c(\alpha, \beta)$ at
time $[\tau n]$ with probability $\beta$ with a level $\alpha$ test based on the test statistic $D_n$.
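
The asymptotic power can be approximated by simulating the limit process. The sketch below treats the simplest case $m = 1$ with $a_1 = 1$ (Gaussian data), where $Z_1$ is fractional Brownian motion $B_H$. It simulates $B_H$ on a grid through a Cholesky factorization of its covariance $\frac{1}{2}(s^{2H} + t^{2H} - |t-s|^{2H})$, estimates $q_\alpha$ from null paths, and then estimates the probability that $\sup_\lambda |\lambda B_H(1) - B_H(\lambda) + c \, \phi_\tau(\lambda)|$ exceeds it. The grid size, number of replications, seed, Hurst parameter and all function names are assumptions chosen for illustration, and the supremum is only taken over grid points.

```python
import math
import random


def fbm_cholesky(times, hurst):
    """Lower-triangular Cholesky factor of the fBm covariance matrix at `times`."""
    n = len(times)
    cov = [[0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(t - s) ** (2 * hurst))
            for t in times] for s in times]
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def phi(tau, lam):
    """phi_tau(lambda): lambda (1 - tau) for lambda <= tau, tau (1 - lambda) otherwise."""
    return lam * (1.0 - tau) if lam <= tau else tau * (1.0 - lam)


def sup_statistics(c, tau, hurst, grid=40, reps=400, seed=1):
    """Simulated values of sup |lam B(1) - B(lam) + c phi_tau(lam)| over the grid."""
    times = [(k + 1) / grid for k in range(grid)]
    low = fbm_cholesky(times, hurst)
    rng = random.Random(seed)              # fixed seed: same paths for every c
    sups = []
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(grid)]
        path = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(grid)]
        end = path[-1]
        sups.append(max(abs(lam * end - b + c * phi(tau, lam))
                        for lam, b in zip(times, path)))
    return sups


def power(c, alpha=0.05, tau=0.5, hurst=0.7):
    """Monte-Carlo estimate of the limiting power for drift constant c."""
    null = sorted(sup_statistics(0.0, tau, hurst))
    n_exceed = round(alpha * len(null))
    q_alpha = null[len(null) - n_exceed - 1]   # empirical upper alpha-quantile
    alt = sup_statistics(c, tau, hurst)
    return sum(s > q_alpha for s in alt) / len(alt)
```

Because the same random paths are reused for every value of `c`, `power(0.0)` reproduces the nominal level, and larger values of `c` show how the drift $c \, \phi_\tau$ pushes the supremum above the critical value.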



3. Power of the Wilcoxon Change-Point Test under Local Alternatives 

In the context of the Wilcoxon change-point test, the Hermite rank is not that of the
function $G$, but of the class of functions

(23) $\left\{ 1_{\{G(\xi_i) \le x\}} - F(x), \; x \in \mathbb{R} \right\},$

where $F(x) = E(1_{\{G(\xi_i) \le x\}}) = P(G(\xi_i) \le x)$. We define the Hermite expansion of the class
of functions (23) as

$1_{\{G(\xi_i) \le x\}} - F(x) = \sum_{q=1}^{\infty} \frac{J_q(x)}{q!} H_q(\xi_i),$

where $H_q$ is again the $q$-th order Hermite polynomial and where the coefficients are

(24) $J_q(x) = E \left( H_q(\xi_i) 1_{\{G(\xi_i) \le x\}} \right).$

We define the Hermite rank of the class of functions (23) as

$m := \min\{q \ge 1 : J_q(x) \ne 0 \text{ for some } x \in \mathbb{R}\}.$

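Theorem 3.1. Suppose that $(\xi_i)_{i \ge 1}$ is a stationary Gaussian process with mean zero, variance 1
and autocovariance function as in (4) with $0 < D < \frac{1}{m}$. Moreover, let $G : \mathbb{R} \to \mathbb{R}$
be a measurable function, and assume that $G(\xi_k)$ has continuous distribution function $F(x)$.
Let $m$ denote the Hermite rank of the class of functions (23), let $d_n$ be as in (10), and let
$h_n$ denote the level shift. Then, under the sequence of alternatives $A_{\tau,h_n}(n)$ defined in (8),
if $h_n \to 0$ as $n \to \infty$, the process

(25) $\left( \frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right) - \frac{n}{d_n} \phi_\tau(\lambda) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right)_{0 \le \lambda \le 1}$

converges in distribution towards the process

(26) $\left( \frac{\int_{\mathbb{R}} J_m(x) \, dF(x)}{m!} \left( Z_m(\lambda) - \lambda Z_m(1) \right) \right)_{0 \le \lambda \le 1},$

where $(Z_m(\lambda))_{\lambda \ge 0}$ denotes the $m$-th order Hermite process with Hurst parameter $H = 1 - \frac{Dm}{2} \in (\frac{1}{2}, 1)$
and where $J_m(x)$ is defined as in (24).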

Remark 3.2. (i) The normalization $d_n$ and the processes $(Z_m(\lambda))_{\lambda \ge 0}$ in Theorem 2.1 and
Theorem 3.1 are the same.

(ii) Since, by assumption, the distribution $F(x)$ of $G(\xi_k)$ is continuous, it follows from integration
by parts that $\int_{\mathbb{R}} F(x) \, dF(x) = \frac{1}{2}$. This explains the $1/2$ in (25) because $\int_{\mathbb{R}} F(x) \, dF(x) =
E(1_{\{X_i \le X_i'\}})$, where $X_i'$ is an independent copy of $X_i$. The independence assumption is reasonable
as the dependence between $X_i$ and $X_j$ vanishes asymptotically when $|i - j| \to \infty$.

(iii) As noted at the beginning of the proof, the first part of (25) converges to (26) under
the null hypothesis. We show in the proof that the second part of (25) compensates for the
presence of the alternatives $A_{\tau,h_n}$.

(iv) We make no assumption about the exact order of the sequence $(h_n)_{n \ge 1}$. Theorem 3.1
holds under the very general assumption that $h_n \to 0$, as $n \to \infty$.

(v) If we choose $(h_n)_{n \ge 1}$ as in (18), the centering constants in (25) converge, provided some
technical assumptions are satisfied. To see this, observe that

$\frac{n}{d_n} \phi_\tau(\lambda) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \sim c \, \phi_\tau(\lambda) \int_{\mathbb{R}} \frac{F(x + h_n) - F(x)}{h_n} \, dF(x)$

$\longrightarrow c \, \phi_\tau(\lambda) \int_{\mathbb{R}} f(x) \, dF(x)$

$= c \, \phi_\tau(\lambda) \int_{\mathbb{R}} f^2(x) \, dx.$

The convergence in the next to last step requires some justification; it holds, e.g., if $F$ is
differentiable with bounded derivative $f(x)$.
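
The convergence of the centering constants can be checked numerically in a concrete case. The sketch below assumes that $F$ is the standard normal distribution function, so that $\int f^2(x) \, dx = 1/(2\sqrt{\pi})$, and evaluates $\frac{1}{h} \int (F(x+h) - F(x)) \, dF(x)$ by a midpoint rule. The integration range, step count and function names are illustrative assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def scaled_centering(h, lo=-10.0, hi=10.0, steps=20000):
    """(1/h) * integral of (F(x + h) - F(x)) f(x) dx for standard normal F."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += (normal_cdf(x + h) - normal_cdf(x)) * normal_pdf(x)
    return total * dx / h


# Limit value: integral of the squared density, 1 / (2 sqrt(pi)).
LIMIT = 1.0 / (2.0 * math.sqrt(math.pi))
```

As `h` decreases, `scaled_centering(h)` approaches `LIMIT`, in line with the limit $c \, \phi_\tau(\lambda) \int f^2(x) \, dx$ of the centering term.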
Corollary 3.3. Suppose that $(\xi_i)_{i \ge 1}$ is a stationary Gaussian process with mean zero, variance 1
and autocovariance function as in (4) with $0 < D < \frac{1}{m}$. Moreover, let $G : \mathbb{R} \to \mathbb{R}$ be
a measurable function, and assume that $G(\xi_k)$ has a distribution function $F(x)$ with bounded
density $f(x)$. Let $m$ denote the Hermite rank of the class of functions $1_{\{G(\xi_i) \le x\}} - F(x)$,
$x \in \mathbb{R}$. Then, under the sequence of alternatives $A_{\tau,h_n}$ defined in (8), with $h_n \sim \frac{d_n}{n} c$, we
obtain that

(27) $\left( \frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right) \right)_{0 \le \lambda \le 1}$

converges in distribution to the process

$\left( \frac{\int_{\mathbb{R}} J_m(x) \, dF(x)}{m!} \left( Z_m(\lambda) - \lambda Z_m(1) \right) + c \, \phi_\tau(\lambda) \int_{\mathbb{R}} f^2(x) \, dx \right)_{0 \le \lambda \le 1}.$



Proof of Theorem 3.1. In our proof, we will make use of the limit theorem that was
derived in Dehling, Rooch and Taqqu (2013) under the null hypothesis. They showed (see
Theorem 1) that

(28) $\frac{1}{n d_n} \sum_{i=1}^{[n\lambda]} \sum_{j=[n\lambda]+1}^{n} \left( 1_{\{G(\xi_i) \le G(\xi_j)\}} - \frac{1}{2} \right) \xrightarrow{\mathcal{D}} \frac{\int_{\mathbb{R}} J_m(x) \, dF(x)}{m!} \left( Z_m(\lambda) - \lambda Z_m(1) \right).$

In order to make use of this result, we will decompose the test statistic into a term whose
distribution is the same both under the null hypothesis as well as under the alternative, and
a second term which, after proper centering, converges to zero. As in Dehling, Rooch and
Taqqu (2013), we will express the test statistic as a functional of the empirical distribution
function of the $G(\xi_i)$, namely

$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} 1_{\{G(\xi_i) \le x\}}.$

Given integers $k, l$ with $k \le l$ we denote by $F_{k,l}(x)$ the empirical distribution function based
on $G(\xi_k), \ldots, G(\xi_l)$, i.e.

(29) $F_{k,l}(x) = \frac{1}{l - k + 1} \sum_{i=k}^{l} 1_{\{G(\xi_i) \le x\}}.$

Recall that under the local alternative, we have

$X_i = \begin{cases} G(\xi_i) + \mu & \text{for } i = 1, \ldots, [n\tau] \\ G(\xi_i) + \mu + h_n & \text{for } i = [n\tau] + 1, \ldots, n. \end{cases}$
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Thus, we obtain for A < r, 



[n\] 



E E i.-v 

i=l j=[n\]+l 



(30) 



[nX] [nr] [nX] n 

i=l j=[nX] + l i=l j = [nr] + l 

[nA] [nr] [nA] n 

,)<G(€,)+h„} 

i=l j=[nX] + l i=l j=[n.r]+l 

[nA] n [n\] n 

E E l{Gte)<Gfe)} + E E (l{G(6)<Gfe)+ft„} - l{Gte)<Gfe)}) 
j=l j = [nA]-|-l i=l j=[rtr]+l 

[nA] n [n\] ri 

E E 

MG(6)<G(5,)} + 

E E 

MGK,)<GK,)<G(C,)+h„}- 

i=l j=[nA]+l i=l j=[nr]+l 



In the same way, we obtain for A > r, 



(31) 



[nA] n 

E E 

i=l i=[nA]+l 

[nr] n [nA] n 

= E E ^{G(6)+M<Gfe)+M+/tn} + E E l{Gte)+M+/in<G(5,)+M+/in} 

i=l j = [nA]+l i=[nr]+l i=[nA] + l 

[nr] n [nA] n 

= E E E E 

MGte)<Gfe)} 

i=l j=[nA] + l 't=[nT]+l j = [nA] + l 

[nA] n [nr] n 

= E E 

l{G(?,)<Gfe)} + E E (^{f?(«.)<G(€,)+/in} - l{G(5,)<G(5j)}) 

i=l j = [nA] + l j=l j=[n.A]+l 

[nA] n [nr] n 

= E E l{G(6)<Gfe)} + E E l{Gfe)<G(5,)<Gfe)+/i„} 

i=l j=[nX]+l i=l j=[n\]+l 
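
The two decompositions can be checked on simulated data. The sketch below assumes a non-negative shift $h$ and uses integer-valued observations so that all comparisons are exact; it counts the pairs with $X_i \le X_j$ directly and compares the result with the sum of the null-hypothesis count and the correction count of pairs with $G(\xi_j) < G(\xi_i) \le G(\xi_j) + h$. The function names and the choice $\mu = 3$ are hypothetical.

```python
import random


def indicator_sum(first, second):
    """Number of pairs (x, y), x in first and y in second, with x <= y."""
    return sum(1 for x in first for y in second if x <= y)


def check_decomposition(g, n_lambda, n_tau, h, mu=3):
    """Return (direct count, decomposed count) for split n_lambda and change point n_tau.

    g holds the unshifted values G(xi_1), ..., G(xi_n); h >= 0 is the level shift."""
    n = len(g)
    x = [g[i] + mu + (h if i >= n_tau else 0) for i in range(n)]
    direct = indicator_sum(x[:n_lambda], x[n_lambda:])
    base = indicator_sum(g[:n_lambda], g[n_lambda:])
    if n_lambda <= n_tau:
        # Case lambda <= tau: first block against the shifted observations.
        extra = sum(1 for a in g[:n_lambda] for b in g[n_tau:] if b < a <= b + h)
    else:
        # Case lambda >= tau: unshifted observations against the second block.
        extra = sum(1 for a in g[:n_tau] for b in g[n_lambda:] if b < a <= b + h)
    return direct, base + extra
```

For any sample and any admissible pair of split and change point, the two returned counts agree, which is exactly the content of the two decompositions.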



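Thus, in order to prove Theorem 3.1 it suffices to show that the following two terms,

(32) $\frac{1}{n d_n} \sup_{0 \le \lambda \le \tau} \left| \sum_{i=1}^{[n\lambda]} \sum_{j=[n\tau]+1}^{n} 1_{\{G(\xi_j) < G(\xi_i) \le G(\xi_j) + h_n\}} - n^2 \lambda (1 - \tau) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right|$

and

(33) $\frac{1}{n d_n} \sup_{\tau \le \lambda \le 1} \left| \sum_{i=1}^{[n\tau]} \sum_{j=[n\lambda]+1}^{n} 1_{\{G(\xi_j) < G(\xi_i) \le G(\xi_j) + h_n\}} - n^2 \tau (1 - \lambda) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right|,$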
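both converge to zero in probability. We first show this for (32). Writing $F_{[n\lambda]} := F_{1,[n\lambda]}$, observe that

$\sum_{i=1}^{[n\lambda]} \sum_{j=[n\tau]+1}^{n} 1_{\{G(\xi_j) < G(\xi_i) \le G(\xi_j) + h_n\}} - n^2 \lambda (1 - \tau) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x)$

$= [n\lambda] \sum_{j=[n\tau]+1}^{n} \left( F_{[n\lambda]}(G(\xi_j) + h_n) - F_{[n\lambda]}(G(\xi_j)) \right) - n^2 \lambda (1 - \tau) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x)$

$= [n\lambda] (n - [n\tau]) \int_{\mathbb{R}} \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) dF_{[n\tau]+1,n}(x) - n^2 \lambda (1 - \tau) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x)$

$= [n\lambda] (n - [n\tau]) \left( \int_{\mathbb{R}} \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) dF_{[n\tau]+1,n}(x) - \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right)$

$\quad + \left( [n\lambda] (n - [n\tau]) - n^2 \lambda (1 - \tau) \right) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x).$

Note that $|[n\lambda](n - [n\tau]) - n^2 \lambda (1 - \tau)| \le n$ and $|\int_{\mathbb{R}} (F(x + h_n) - F(x)) \, dF(x)| \le 1$. Thus

$\frac{1}{n d_n} \left| \left( [n\lambda] (n - [n\tau]) - n^2 \lambda (1 - \tau) \right) \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right| \le \frac{1}{d_n} \to 0,$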

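as $n \to \infty$. Hence, in order to show that (32) converges to zero in probability, it suffices to
show that

(34) $\frac{1}{n d_n} [n\lambda] (n - [n\tau]) \left( \int_{\mathbb{R}} \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) dF_{[n\tau]+1,n}(x) - \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x) \right)$

converges to zero, in probability. In order to prove this, we rewrite the difference of the
integrals in (34) as

(35) $\int_{\mathbb{R}} \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) dF_{[n\tau]+1,n}(x) - \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) dF(x)$

$= \int_{\mathbb{R}} \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) dF_{[n\tau]+1,n}(x) + \int_{\mathbb{R}} \left( F(x + h_n) - F(x) \right) d\left( F_{[n\tau]+1,n} - F \right)(x)$

$= \int_{\mathbb{R}} \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) dF_{[n\tau]+1,n}(x) - \int_{\mathbb{R}} \left( F_{[n\tau]+1,n}(x) - F(x) \right) d\left( F(x + h_n) - F(x) \right),$

where we have used integration by parts in the final step. Thus, in order to prove that (34)
converges to zero, it suffices to show that the following two terms converge in probability, as
$n \to \infty$,

(36) $\frac{1}{d_n} [n\lambda] \int_{\mathbb{R}} \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) dF_{[n\tau]+1,n}(x) \to 0,$

(37) $\frac{1}{d_n} (n - [n\tau]) \int_{\mathbb{R}} \left( F_{[n\tau]+1,n}(x) - F(x) \right) d\left( F(x + h_n) - F(x) \right) \to 0.$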

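In order to prove (36) and (37), we now apply the empirical process non-central limit theorem
of Dehling and Taqqu (1989) which states that

$\left( d_n^{-1} [n\lambda] \left( F_{[n\lambda]}(x) - F(x) \right) \right)_{x \in [-\infty,\infty], \lambda \in [0,1]} \xrightarrow{\mathcal{D}} \left( J(x) Z(\lambda) \right)_{x \in [-\infty,\infty], \lambda \in [0,1]},$

where

$J(x) = J_m(x) = E \left( 1_{\{G(\xi_i) \le x\}} H_m(\xi_i) \right)$ and $Z(\lambda) = \frac{Z_m(\lambda)}{m!}.$

By the Dudley-Wichura version of Skorohod's representation theorem (see Shorack and Wellner
(1986), Theorem 2.3.4) we may assume without loss of generality that convergence holds
almost surely with respect to the supremum norm on the function space $D([0, 1] \times [-\infty, \infty])$,
i.e.

(38) $\sup_{\lambda \in [0,1], x \in \mathbb{R}} \left| d_n^{-1} [n\lambda] \left( F_{[n\lambda]}(x) - F(x) \right) - J(x) Z(\lambda) \right| \to 0$ a.s.

Note that by definition, for any $\lambda \le \tau$,

$([n\tau] - [n\lambda]) \left( F_{[n\lambda]+1,[n\tau]}(x) - F(x) \right) = [n\tau] \left( F_{[n\tau]}(x) - F(x) \right) - [n\lambda] \left( F_{[n\lambda]}(x) - F(x) \right).$

Hence, we may deduce from (38) the following limit theorem for the empirical distribution
of the observations $X_{[n\lambda]+1}, \ldots, X_{[n\tau]}$,

(39) $\sup_{0 \le \lambda \le \tau, x \in \mathbb{R}} \left| d_n^{-1} ([n\tau] - [n\lambda]) \left( F_{[n\lambda]+1,[n\tau]}(x) - F(x) \right) - J(x) \left( Z(\tau) - Z(\lambda) \right) \right| \to 0,$

almost surely. In the same way, we obtain

(40) $\sup_{0 \le \lambda \le 1, x \in \mathbb{R}} \left| d_n^{-1} (n - [n\lambda]) \left( F_{[n\lambda]+1,n}(x) - F(x) \right) - J(x) \left( Z(1) - Z(\lambda) \right) \right| \to 0,$
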

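almost surely. Now we return to (36) and write

(41) $\left| \frac{1}{d_n} [n\lambda] \int_{\mathbb{R}} \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) dF_{[n\tau]+1,n}(x) \right|$

$\le \left| \int_{\mathbb{R}} \left( J(x + h_n) - J(x) \right) Z(\lambda) \, dF_{[n\tau]+1,n}(x) \right|$

$\quad + \sup_{x \in \mathbb{R}, 0 \le \lambda \le 1} \left| \frac{1}{d_n} [n\lambda] \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) - \left( J(x + h_n) - J(x) \right) Z(\lambda) \right|$

$\le \left| \int_{\mathbb{R}} \left( J(x + h_n) - J(x) \right) dF_{[n\tau]+1,n}(x) \right| \sup_{0 \le \lambda \le 1} |Z(\lambda)|$

$\quad + \sup_{x \in \mathbb{R}, 0 \le \lambda \le 1} \left| \frac{1}{d_n} [n\lambda] \left( \left( F_{[n\lambda]}(x + h_n) - F_{[n\lambda]}(x) \right) - \left( F(x + h_n) - F(x) \right) \right) - \left( J(x + h_n) - J(x) \right) Z(\lambda) \right|.$
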
The second term on the right-hand side converges to zero by (38). Concerning the first term,
note that

(42) $J(x) = \int_{\mathbb{R}} 1_{\{G(y) \le x\}} H_m(y) \varphi(y) \, dy = - \int_{\mathbb{R}} 1_{\{x \le G(y)\}} H_m(y) \varphi(y) \, dy,$

where $\varphi(y) = \frac{1}{\sqrt{2\pi}} e^{-y^2/2}$ denotes the standard normal density function. For the second
identity, we have used the fact that $G(\xi)$, by assumption, has a continuous distribution, and
that $\int_{\mathbb{R}} H_m(y) \varphi(y) \, dy = 0$, for $m \ge 1$. Using (42) we thus obtain

(43) $\int_{\mathbb{R}} J(x) \, dF_{[n\tau]+1,n}(x) = - \int_{\mathbb{R}} \int_{\mathbb{R}} 1_{\{x \le G(y)\}} H_m(y) \varphi(y) \, dy \, dF_{[n\tau]+1,n}(x)$

$= - \int_{\mathbb{R}} \left( \int_{\mathbb{R}} 1_{\{x \le G(y)\}} \, dF_{[n\tau]+1,n}(x) \right) H_m(y) \varphi(y) \, dy$

$= - \int_{\mathbb{R}} F_{[n\tau]+1,n}(G(y)) H_m(y) \varphi(y) \, dy,$

and, using analogous arguments,

(44) $\int_{\mathbb{R}} J(x + h_n) \, dF_{[n\tau]+1,n}(x) = - \int_{\mathbb{R}} F_{[n\tau]+1,n}(G(y) - h_n) H_m(y) \varphi(y) \, dy.$

By the Glivenko-Cantelli theorem, applied to the stationary, ergodic process $(G(\xi_i))_{i \ge 1}$, we
get $\sup_{x \in \mathbb{R}} |F_n(x) - F(x)| \to 0$, almost surely. Since

$F_{[n\tau]+1,n}(x) = \frac{n}{n - [n\tau]} F_n(x) - \frac{[n\tau]}{n - [n\tau]} F_{[n\tau]}(x),$

we get that, almost surely,

(45) $\sup_{x \in \mathbb{R}} \left| F_{[n\tau]+1,n}(x) - F(x) \right| \to 0.$
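
The representation of the block empirical distribution function through the prefix versions $F_n$ and $F_{[n\tau]}$ is a purely algebraic identity, which the following sketch verifies in exact rational arithmetic. The function names are hypothetical and the sample is arbitrary.

```python
from fractions import Fraction


def edf(sample, x):
    """Empirical distribution function of `sample` at x, as an exact fraction."""
    return Fraction(sum(1 for v in sample if v <= x), len(sample))


def block_edf_from_prefixes(sample, split, x):
    """EDF of sample[split:] at x, written as
    n/(n - split) * F_n(x) - split/(n - split) * F_split(x)."""
    n = len(sample)
    return (Fraction(n, n - split) * edf(sample, x)
            - Fraction(split, n - split) * edf(sample[:split], x))
```

For every threshold `x` and every `0 < split < n`, `block_edf_from_prefixes(sample, split, x)` equals `edf(sample[split:], x)`, so uniform convergence of the two prefix versions carries over to the block version as long as $n - [n\tau]$ grows proportionally to $n$.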



Returning to the first term on the right-hand side of (41), we obtain, using (43) and (44),

$\left| \int_{\mathbb{R}} \left( J(x + h_n) - J(x) \right) dF_{[n\tau]+1,n}(x) \right|$

$= \left| \int_{\mathbb{R}} \left( F_{[n\tau]+1,n}(G(y) - h_n) - F_{[n\tau]+1,n}(G(y)) \right) H_m(y) \varphi(y) \, dy \right|$

$\le \int_{\mathbb{R}} \left| F(G(y) - h_n) - F(G(y)) \right| |H_m(y)| \varphi(y) \, dy$

$\quad + 2 \sup_{x} \left| F_{[n\tau]+1,n}(x) - F(x) \right| \int_{\mathbb{R}} |H_m(y)| \varphi(y) \, dy.$

Both terms on the right-hand side converge to zero; the second one by (45), the first one by
continuity of $F$, the fact that $h_n \to 0$, and Lebesgue's dominated convergence theorem. In
both cases, we have made use of the fact that $\int_{\mathbb{R}} |H_m(y)| \varphi(y) \, dy < \infty$. Thus we have finally
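established (36). In order to prove (37), we observe that

$\left| \frac{1}{d_n} (n - [n\tau]) \int_{\mathbb{R}} \left( F_{[n\tau]+1,n}(x) - F(x) \right) d\left( F(x + h_n) - F(x) \right) \right|$

$\le \left| \int_{\mathbb{R}} J(x) \left( Z(1) - Z(\tau) \right) d\left( F(x + h_n) - F(x) \right) \right|$

$\quad + \sup_{x \in \mathbb{R}} \left| \frac{1}{d_n} (n - [n\tau]) \left( F_{[n\tau]+1,n}(x) - F(x) \right) - J(x) \left( Z(1) - Z(\tau) \right) \right|$

$\le \left| \int_{\mathbb{R}} J(x) \, d\left( F(x + h_n) - F(x) \right) \right| \, |Z(1) - Z(\tau)|$

$\quad + \sup_{x \in \mathbb{R}} \left| \frac{1}{d_n} (n - [n\tau]) \left( F_{[n\tau]+1,n}(x) - F(x) \right) - J(x) \left( Z(1) - Z(\tau) \right) \right|.$

The second term on the right-hand side converges to zero, by (40) with $\lambda = \tau$. Concerning the first
term, note that

$\int_{\mathbb{R}} J(x) \, d\left( F(x + h_n) - F(x) \right) = E \left( J(G(\xi_i) - h_n) - J(G(\xi_i)) \right).$

Applying Lebesgue's dominated convergence theorem and making use of the fact that, by
assumption, $J$ is continuous, we obtain that $\int_{\mathbb{R}} J(x) \, d(F(x + h_n) - F(x)) \to 0$. In this way,
we have finally proved that (32) converges to zero, in probability. By similar arguments, we
can prove this for (33), which finally ends the proof of Theorem 3.1. □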

4. ARE of the Wilcoxon Change-Point Test and the CUSUM Test for LRD Data

In this section, we calculate the asymptotic relative efficiency (ARE) of the Wilcoxon
change-point test with respect to the CUSUM test. To do so, we calculate the number of
observations needed to detect a small level shift $h$ at time $[\tau n]$ with a test of given level
$\alpha$ and given power $\beta$, both for the Wilcoxon change-point test and the CUSUM test, and
denote these numbers by $n_W$ and $n_C$, respectively. We then define the asymptotic relative
efficiency of the Wilcoxon change-point test with respect to the CUSUM test by

(46) $ARE(W, C) = \lim_{h \to 0} \frac{n_C}{n_W}.$

It will turn out that the limit (46) exists and that the asymptotic relative efficiency does not
depend on the choice of $\tau, \alpha, \beta$. If this limit is larger than 1, then the CUSUM test requires
a larger sample size to detect the level shift, and hence the Wilcoxon change-point test is
(asymptotically) more efficient.

In the remaining part of this section, we will focus on the case when $m = 1$ both for the
CUSUM as well as the Wilcoxon change-point test, i.e. when the Hermite rank of $G(\xi_i)$ and
of the class of functions $1_{\{G(\xi_i) \le x\}} - F(x)$, $x \in \mathbb{R}$, are both equal to 1. This is the case, for
example, when $G$ is a strictly monotone function. In this case

$\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right| = \frac{1}{2\sqrt{\pi}} \ne 0;$

see Relation (20) in Dehling, Rooch and Taqqu (2013), showing that the Hermite rank of
the class of functions $1_{\{G(\xi_i) \le x\}} - F(x)$, $x \in \mathbb{R}$, equals 1. Focusing now on $G(\xi_i)$ and using
integration by parts, we get that the first order Hermite coefficient $a_1$ of $G$ equals

$a_1 = E(G(\xi_i) \xi_i) = \int_{\mathbb{R}} G(x) \, x \, \varphi(x) \, dx = - \int_{\mathbb{R}} G(x) \varphi'(x) \, dx = \int_{\mathbb{R}} \varphi(x) \, dG(x) \ne 0,$

where $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ denotes the standard normal density function; the last integral is
positive for strictly increasing $G$ and negative for strictly decreasing $G$. Thus, the Hermite
rank of $G(\xi_i)$ equals 1, as well.

In this case, i.e. when $m = 1$, the Hermite process arising as limit in Theorem 2.1,
Theorem 3.1 and Corollary 3.3 is fractional Brownian motion $(B_H(\lambda))_{0 \le \lambda \le 1}$. Note that
fractional Brownian motion is symmetric, i.e. $(-B_H(\lambda))_{0 \le \lambda \le 1}$ has the same distribution as
$(B_H(\lambda))_{0 \le \lambda \le 1}$. Thus the limit processes in Theorem 2.1 and Corollary 3.3 can also be written
as

(47) $\left( |a_1| \left( B_H(\lambda) - \lambda B_H(1) \right) + c \, \phi_\tau(\lambda) \right)_{0 \le \lambda \le 1},$

(48) $\left( \left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right| \left( B_H(\lambda) - \lambda B_H(1) \right) + c \, \phi_\tau(\lambda) \int_{\mathbb{R}} f^2(x) \, dx \right)_{0 \le \lambda \le 1}.$

As preparation, we first calculate a quantity that is related to the asymptotic relative
efficiency, namely the ratio of the sizes of level shifts that can be detected by the two tests,
based on the same number of observations $n$, again for given values of $\tau, \alpha, \beta$. We denote the
corresponding level shifts by $h_W(n)$ and $h_C(n)$, respectively, assuming that these numbers
depend on $n$ in the following way:

(49) $h_W(n) = c_W \frac{d_n}{n},$

(50) $h_C(n) = c_C \frac{d_n}{n}.$

In order to simplify the following considerations, we take a one-sided change-point test, thus
rejecting the hypothesis of no change-point for large values of

$\max_{1 \le k \le n-1} \sum_{i=1}^{k} \sum_{j=k+1}^{n} (X_j - X_i)$

or

$\max_{1 \le k \le n-1} \sum_{i=1}^{k} \sum_{j=k+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right),$

respectively. These are the appropriate tests when testing against the alternative of a non-negative
level shift. In order to obtain tests that have asymptotically level $\alpha$, the CUSUM
test and the Wilcoxon change-point test reject the null hypothesis when

(51) $\frac{1}{n d_n |a_1|} \max_{1 \le k \le n-1} \sum_{i=1}^{k} \sum_{j=k+1}^{n} (X_j - X_i) \ge q_\alpha,$

(52) $\frac{1}{n d_n \left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|} \max_{1 \le k \le n-1} \sum_{i=1}^{k} \sum_{j=k+1}^{n} \left( 1_{\{X_i \le X_j\}} - \frac{1}{2} \right) \ge q_\alpha,$

where $q_\alpha$ denotes the upper $\alpha$-quantile of the distribution of $\sup_{0 \le \lambda \le 1} (B_H(\lambda) - \lambda B_H(1))$.
This follows from Theorem 2.1 and Corollary 3.3 after applying the continuous mapping
theorem. The constants $a_1$ and the functions $J_1(x)$ are defined in (13) and (24), respectively,
and have just been computed. Under the sequence of alternatives $A_{\tau,h_C(n)}(n)$, the asymptotic
distribution of the test statistic in (51) is given by

$\sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + \frac{c_C}{|a_1|} \phi_\tau(\lambda) \right);$
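see Theorem 2.1. Under the sequence of alternatives $A_{\tau,h_W(n)}(n)$, the asymptotic distribution
of the test statistic in (52) is given by

$\sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + \frac{c_W \int_{\mathbb{R}} f^2(x) \, dx}{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|} \phi_\tau(\lambda) \right);$

see Corollary 3.3. Thus, the asymptotic power of the CUSUM test is given by

(53) $\lim_{n \to \infty} P \left( \frac{1}{n d_n |a_1|} \max_{1 \le k \le n-1} \sum_{i=1}^{k} \sum_{j=k+1}^{n} (X_j - X_i) \ge q_\alpha \right) = P \left( \sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + \frac{c_C}{|a_1|} \phi_\tau(\lambda) \right) \ge q_\alpha \right).$

In the same way, we obtain the power of the Wilcoxon change-point test

(54) $P \left( \sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + \frac{c_W \int_{\mathbb{R}} f^2(x) \, dx}{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|} \phi_\tau(\lambda) \right) \ge q_\alpha \right).$

Thus, if we want the two tests to have identical power, we have to choose $c_C$ and $c_W$ in such
a way that

$\frac{c_C}{|a_1|} \phi_\tau(\lambda) = \frac{c_W \int_{\mathbb{R}} f^2(x) \, dx}{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|} \phi_\tau(\lambda),$

which again yields by (49) and (50),

(55) $\frac{h_C(n)}{h_W(n)} = \frac{c_C}{c_W} = \frac{|a_1| \int_{\mathbb{R}} f^2(x) \, dx}{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|}.$

This quantity gives the ratio of the height of a level shift that can be detected by a CUSUM
test over the height that can be detected by a Wilcoxon change-point test, when both tests
are assumed to have the same level $\alpha$, the same power $\beta$ and the shifts are taking place at
the same time $[n\tau]$. In addition, we assume that the tests are based on the same number of
observations $n$, which is supposed to be large.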

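Example 4.1. In the case of Gaussian data, i.e. when $G(t) = t$, we have $m = 1$, $a_1 =
E(\xi H_1(\xi)) = E(\xi^2) = 1$, $\int_{\mathbb{R}} f^2(x) \, dx = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-x^2} dx = \frac{1}{2\sqrt{\pi}}$ and $\int_{\mathbb{R}} J_1(x) \, dF(x) = -\frac{1}{2\sqrt{\pi}}$; see
Dehling, Rooch and Taqqu (2013), Relation (20). Thus we obtain

(56) $\frac{c_C}{c_W} = \frac{|a_1| \int_{\mathbb{R}} f^2(x) \, dx}{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|} = \frac{1/(2\sqrt{\pi})}{1/(2\sqrt{\pi})} = 1.$

Hence, both tests can asymptotically, as $n \to \infty$, detect level shifts of the same height.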
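
The three constants entering the ratio of detectable level shifts can be recomputed numerically for Gaussian data. The sketch below assumes $G(t) = t$, so that $f$ is the standard normal density and $J_1(x) = \int_{-\infty}^{x} y \varphi(y) \, dy$, and approximates $a_1$, $\int f^2$ and $\int J_1 \, dF$ by a midpoint rule with a running integral for $J_1$. The integration range, step count and function names are illustrative assumptions.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gaussian_case_constants(lo=-10.0, hi=10.0, steps=20000):
    """Return (a_1, integral of f^2, integral of J_1 dF) for G(t) = t."""
    dx = (hi - lo) / steps
    xs = [lo + (k + 0.5) * dx for k in range(steps)]
    a1 = sum(x * x * normal_pdf(x) for x in xs) * dx      # E(xi * G(xi))
    f_sq = sum(normal_pdf(x) ** 2 for x in xs) * dx       # integral of f^2
    j1 = 0.0        # running value of J_1(x) = int_{-inf}^{x} y phi(y) dy
    j1_df = 0.0     # integral of J_1(x) f(x) dx
    for x in xs:
        cell = x * normal_pdf(x) * dx
        j1 += 0.5 * cell                  # advance J_1 to the cell midpoint
        j1_df += j1 * normal_pdf(x) * dx
        j1 += 0.5 * cell                  # advance J_1 to the cell end
    return a1, f_sq, j1_df


def height_ratio():
    """Ratio |a_1| * int f^2 / |int J_1 dF| of detectable level shifts."""
    a1, f_sq, j1_df = gaussian_case_constants()
    return abs(a1) * f_sq / abs(j1_df)
```

The numerical values agree with $a_1 = 1$ and $\int f^2 = |\int J_1 \, dF| = 1/(2\sqrt{\pi})$, so `height_ratio()` is numerically indistinguishable from 1.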
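The level shifts can be expressed in terms of

$\psi(t) := P \left( \sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + t \, \phi_\tau(\lambda) \right) \ge q_\alpha \right),$

viewed as a function of $t$, for fixed values of $\tau$ and $\alpha$. The function $\psi$ is monotonically
increasing. We define the generalized inverse,

$\psi^-(\beta) := \inf\{t \ge 0 : \psi(t) \ge \beta\}.$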
Thus, we get

(57) $P \left( \sup_{0 \le \lambda \le 1} \left( B_H(\lambda) - \lambda B_H(1) + \psi^-(\beta) \, \phi_\tau(\lambda) \right) \ge q_\alpha \right) \ge \beta,$

and, in fact, for given $\tau$, $\alpha$ and $\beta$, $\psi^-(\beta)$ is the smallest number having this property.

We can now apply Theorem 2.1 and Theorem 3.1. By comparing (53) and (57), we can
detect a level shift of size $h$ at time $[n\tau]$ with a CUSUM test of level $\alpha$ and power $\beta$ based
on $n$ observations, if $h_C(n) \sim \frac{d_n}{n} c_C$, where $c_C$ satisfies $\frac{c_C}{|a_1|} = \psi^-(\beta)$. Hence we obtain that
$h_C(n)$ has to satisfy

$h_C(n) = \frac{d_n}{n} |a_1| \psi^-(\beta) \sim c_1^{1/2} n^{-D/2} L^{1/2}(n) \, |a_1| \, \psi^-(\beta).$

Similarly, by comparing (54) and (57), we get for the Wilcoxon change-point test that $h_W(n)$ has
to satisfy

$h_W(n) = \frac{d_n}{n} \frac{\left| \int_{\mathbb{R}} J_1(x) \, dF(x) \right|}{\int_{\mathbb{R}} f^2(x) \, dx} \psi^-(\beta).$

In the following theorem, we compute the asymptotic relative efficiency of the Wilcoxon
change-point test with respect to the CUSUM test.

Theorem 4.2. Let $(\xi_i)_{i \ge 1}$ be a stationary Gaussian process with mean zero, variance 1
and autocovariance function as in (2). Moreover, let $G : \mathbb{R} \to \mathbb{R}$ be a measurable function
satisfying $E(G^2(\xi_i)) < \infty$, and such that $G(\xi_i)$ has a distribution function $F(x)$ with bounded
density $f(x)$. Assume that the Hermite rank of $G(\xi_i)$ as well as the Hermite rank of the class
of functions $\{1_{\{G(\xi_i) \le x\}} - F(x),\ x \in \mathbb{R}\}$ are equal to 1. Moreover assume that $0 < D < 1$.
Then

$$ARE(T_W, T_C) = \left( \frac{|a_1| \int_{\mathbb{R}} f^2(x)\,dx}{\left|\int_{\mathbb{R}} J_1(x)\,dF(x)\right|} \right)^{2/D}, \tag{58}$$

where $T_C$ and $T_W$ denote the CUSUM test and the Wilcoxon change-point test, respectively.
Proof. For abbreviation, we define

$$b := \left( \frac{\left|\int_{\mathbb{R}} J_1(x)\,dF(x)\right|}{|a_1| \int_{\mathbb{R}} f^2(x)\,dx} \right)^{2/D}. \tag{59}$$

We will show that the Wilcoxon change-point test based on $bn$ observations has asymptotically
the same power as the CUSUM test based on $n$ observations. We will consider the
local alternative

$$A_n = A_{\tau, h_n}(n), \qquad h_n = \frac{d_n}{n}\,c, \tag{60}$$

for the CUSUM test, and the local alternative

$$A'_n = A_{\tau, h'_n}(n), \qquad h'_n = \frac{b}{n}\,d_{n/b}\,c, \tag{61}$$

for the Wilcoxon change-point test. Note that under $A_n$ the level shift is the same as under
$A'_{bn}$. Further observe that, by the asymptotic behaviour of the normalizing sequence $d_n$,

$$\frac{b}{n}\,c\,d_{n/b} = c\,\frac{d_n}{n}\,b\,\frac{d_{n/b}}{d_n} \sim c\,\frac{d_n}{n}\,b^{D/2} \left( \frac{L(n/b)}{L(n)} \right)^{1/2} \sim c\,b^{D/2}\,\frac{d_n}{n}, \tag{62}$$

where we have used the fact that $L(n)$ is a slowly varying function.

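For the CUSUM test, we can apply Corollary 2.3 and we obtain under the local alternative
$A_n$ that

$$\frac{1}{|a_1|}\,D_n \xrightarrow{\mathcal{D}} \sup_{0 \le \lambda \le 1} \Big\{ \big(B_H(\lambda) - \lambda B_H(1)\big) + \frac{c}{|a_1|}\,\phi_\tau(\lambda) \Big\}.$$

For the Wilcoxon change-point test, we apply Corollary 3.3 with $c$ replaced by $c\,b^{D/2}$, in
view of (62). We thus obtain under the local alternative $A'_n$,

$$\frac{1}{\left|\int_{\mathbb{R}} J_1(x)\,dF(x)\right|}\,W_n \xrightarrow{\mathcal{D}} \sup_{0 \le \lambda \le 1} \Big\{ \big(B_H(\lambda) - \lambda B_H(1)\big) + c\,b^{D/2}\,\frac{\int_{\mathbb{R}} f^2(x)\,dx}{\left|\int_{\mathbb{R}} J_1(x)\,dF(x)\right|}\,\phi_\tau(\lambda) \Big\}$$

$$= \sup_{0 \le \lambda \le 1} \Big\{ \big(B_H(\lambda) - \lambda B_H(1)\big) + \frac{c}{|a_1|}\,\phi_\tau(\lambda) \Big\}$$

by (62). Let $q_\alpha$ denote the upper $\alpha$-quantile of the distribution of $\sup_{0 \le \lambda \le 1}\{B_H(\lambda) - \lambda B_H(1)\}$;
then the tests that reject the null hypothesis when $\frac{1}{|a_1|} D_n \ge q_\alpha$ or when $\frac{1}{|\int_{\mathbb{R}} J_1(x)\,dF(x)|} W_n \ge q_\alpha$,
respectively, have asymptotically level $\alpha$. The power of these tests at the alternatives $A_n$
and $A'_n$, respectively, converges to

$$P\Big( \sup_{0 \le \lambda \le 1} \Big\{ \big(B_H(\lambda) - \lambda B_H(1)\big) + \frac{c}{|a_1|}\,\phi_\tau(\lambda) \Big\} \ge q_\alpha \Big).$$

Note that this also holds for the power along any other sequence, such as $bn$. Since the level
shift at the alternative $A'_{bn}$ equals the level shift at the alternative $A_n$, we have shown that
the Wilcoxon change-point test requires $bn$ observations to yield the same performance as
the CUSUM test with $n$ observations. Thus $ARE(T_W, T_C) = 1/b$, proving the theorem. □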

5. ARE of the Wilcoxon Change-Point Test and the CUSUM Test for IID Data

We have shown in Example 4.1 that in the case of LRD data, the ARE of the Wilcoxon
change-point test and the CUSUM test is 1 for Gaussian data. In this section, we will
compare this surprising result with the case of i.i.d. Gaussian data. We will find that in this
case, the ARE is $3/\pi$, i.e. the Wilcoxon change-point test is less efficient than the CUSUM
test.

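We consider i.i.d. observations $X_1, \ldots, X_n$ with $X_i \sim \mathcal{N}(0, 1)$ and the U-statistic

$$U_k = \sum_{i=1}^{k} \sum_{j=k+1}^{n} h(X_i, X_j).$$

As kernel we will choose $h_C(x,y) = y - x$ and $h_W(x,y) = I_{\{x \le y\}} - \frac{1}{2}$; in other words, we
consider

$$U_k^{(C)} = \sum_{i=1}^{k} \sum_{j=k+1}^{n} h_C(X_i, X_j) = \sum_{i=1}^{k} \sum_{j=k+1}^{n} (X_j - X_i),$$

$$U_k^{(W)} = \sum_{i=1}^{k} \sum_{j=k+1}^{n} h_W(X_i, X_j) = \sum_{i=1}^{k} \sum_{j=k+1}^{n} \Big( I_{\{X_i \le X_j\}} - \frac{1}{2} \Big).$$

Both kernels $h_C, h_W$ are antisymmetric, i.e. they satisfy $h(x,y) = -h(y,x)$ (for $h_W$ whenever $x \ne y$, which holds almost surely for continuous data), so in order to
determine the limit behaviour of $U_k^{(C)}$ and $U_k^{(W)}$, we can apply the limit theorems of Csörgő
and Horváth (1988).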
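The two-sample U-statistics above translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, the double sum is evaluated by brute force (cubic cost over all $k$, so suitable for small samples only), and the one-sided statistics are normalized as in the i.i.d. Gaussian setting of this section ($n^{3/2}$ for the CUSUM kernel and $n^{3/2}\sqrt{1/12}$ for the Wilcoxon kernel).

```python
def two_sample_u(x, kernel):
    """Return [U_1, ..., U_{n-1}], where U_k = sum_{i<=k} sum_{j>k} kernel(x_i, x_j)."""
    n = len(x)
    return [
        sum(kernel(x[i], x[j]) for i in range(k) for j in range(k, n))
        for k in range(1, n)
    ]

def h_cusum(x, y):
    """CUSUM kernel h_C(x, y) = y - x."""
    return y - x

def h_wilcoxon(x, y):
    """Wilcoxon kernel h_W(x, y) = 1{x <= y} - 1/2."""
    return (1.0 if x <= y else 0.0) - 0.5

def one_sided_statistics(x):
    """Maxima over k of the normalized U-statistics (i.i.d. normalization)."""
    n = len(x)
    cusum = max(two_sample_u(x, h_cusum)) / n ** 1.5
    wilcoxon = max(two_sample_u(x, h_wilcoxon)) / (n ** 1.5 * (1 / 12) ** 0.5)
    return cusum, wilcoxon

# Toy sample with an upward jump after the second observation
u_c = two_sample_u([0, 0, 1, 1], h_cusum)
u_w = two_sample_u([0, 0, 1, 1], h_wilcoxon)
```

For the toy sample both sequences peak at $k = 2$, the true break point. For long series the CUSUM sums can be computed in linear time from partial sums, since $U_k^{(C)} = k S_n - n S_k$ with $S_k = X_1 + \cdots + X_k$.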

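Theorem 5.1. Let $X_1, \ldots, X_n$ be i.i.d. random variables with $X_i \sim \mathcal{N}(0, 1)$. Under the null
hypothesis of no change in the mean, one has

$$\sup_{0 \le \lambda \le 1} \left| \frac{1}{n^{3/2}}\,U^{(C)}_{[\lambda n]} - BB_{1,n}(\lambda) \right| = o_P(1) \tag{63}$$

and

$$\sup_{0 \le \lambda \le 1} \left| \frac{1}{n^{3/2}\sqrt{1/12}}\,U^{(W)}_{[\lambda n]} - BB_{2,n}(\lambda) \right| = o_P(1), \tag{64}$$

where $(BB_{i,n}(\lambda))_{0 \le \lambda \le 1}$, $i = 1, 2$, is a sequence of Brownian bridges with mean $E[BB_{i,n}(\lambda)] = 0$
and auto-covariance $E[BB_{i,n}(s)\,BB_{i,n}(t)] = \min(s,t) - st$.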

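Proof. By Theorem 4.1 of Csörgő and Horváth (1988), it holds under the null hypothesis $H$
that

$$\sup_{0 \le \lambda \le 1} \left| \frac{1}{n^{3/2}\sigma}\,U_{[\lambda n]} - BB_n(\lambda) \right| = o_P(1),$$

where $(BB_n(\lambda))_{0 \le \lambda \le 1}$ is a sequence of Brownian bridges like $BB_{1,n}$ and $BB_{2,n}$ above and where
$\sigma^2 = E[\tilde{h}^2(X_1)]$ with $\tilde{h}(t) = E[h(t, X_1)]$. The kernel $h$ has to fulfill $E[h^2(X_1, X_2)] < \infty$, which
is the case for $h_C(x,y) = y - x$ and $h_W(x,y) = I_{\{x \le y\}} - \frac{1}{2}$ and Gaussian $X_i$. □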

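Theorem 5.2. Let $X_1, \ldots, X_n$ be i.i.d. random variables with $X_i \sim \mathcal{N}(0, 1)$. Under the
sequence of alternatives $A_{\tau,h_n}(n)$ and with $h_n = \frac{1}{\sqrt{n}}\,c$, where $c$ is a constant, one has

$$\left( \frac{1}{n^{3/2}}\,U^{(C)}_{[\lambda n]} \right)_{0 \le \lambda \le 1} \to \big( BB_1(\lambda) + c\,\phi_\tau(\lambda) \big)_{0 \le \lambda \le 1} \tag{65}$$

and

$$\left( \frac{1}{n^{3/2}\sqrt{1/12}}\,U^{(W)}_{[\lambda n]} \right)_{0 \le \lambda \le 1} \to \left( BB_2(\lambda) + \frac{c}{2\sqrt{\pi}\,\sqrt{1/12}}\,\phi_\tau(\lambda) \right)_{0 \le \lambda \le 1} \tag{66}$$

in distribution, where $(BB_i(\lambda))_{0 \le \lambda \le 1}$ is a Brownian bridge, $i = 1, 2$.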



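Proof. First, we prove (65). Like for the case of LRD observations, we decompose the
statistic, so that we obtain under the sequence of alternatives $A_{\tau,h_n}(n)$, writing $X_i = \mu_i + \varepsilon_i$,

$$\frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} (X_j - X_i) = \frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} (\varepsilon_j - \varepsilon_i) + \frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} (\mu_j - \mu_i).$$

By Theorem 5.1 the first term on the right-hand side converges to a Brownian bridge $BB(\lambda)$.
For the second term we have, like in the proof for LRD observations,

$$\frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} (\mu_j - \mu_i) \sim \sqrt{n}\,h_n\,\phi_\tau(\lambda),$$

and in order for the right-hand side to converge, we have to choose

$$h_n = \frac{1}{\sqrt{n}}\,c. \tag{67}$$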



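Now let us prove (66). Again like for LRD observations, we decompose the statistic into one
term that converges like under the null hypothesis and one term which converges to a constant.
Under the sequence of alternatives $A_{\tau,h_n}(n)$ and for the case $\lambda \le \tau$, this decomposition
is

$$\frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} \Big( I_{\{X_i \le X_j\}} - \frac{1}{2} \Big) = \frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\lambda n]+1}^{n} \Big( I_{\{\varepsilon_i \le \varepsilon_j\}} - \frac{1}{2} \Big) + \frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\tau n]+1}^{n} I_{\{\varepsilon_j < \varepsilon_i \le \varepsilon_j + h_n\}}. \tag{68}$$

The first term converges uniformly to a Brownian bridge, like under the null hypothesis. We
will show that, if the observations $\varepsilon_i = G(\xi_i)$ are i.i.d. with c.d.f. $F$ which has two bounded
derivatives $F' = f$ and $F''$, the second term converges uniformly to $c\lambda(1 - \tau) \int_{\mathbb{R}} f^2(x)\,dx$,
which is $c\,\phi_\tau(\lambda) \int_{\mathbb{R}} f^2(x)\,dx$ for the case $\lambda \le \tau$. In the case of standard normally distributed
$G(\xi_i)$, i.e. for $F = \Phi$ and $f = \varphi$, this function is $c\,(2\sqrt{\pi})^{-1} \phi_\tau(\lambda)$. To this end, we consider
the following sequence of Hoeffding decompositions for the sequence of kernels $h_n(x,y) = I_{\{y < x \le y + h_n\}}$:

$$h_n(x,y) = \vartheta_n + h_{1,n}(x) + h_{2,n}(y) + h_{3,n}(x,y). \tag{69}$$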

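Let $X, Y \sim F$ be i.i.d. random variables. Then we define

$$\vartheta_n := E[h_n(X,Y)] = P(Y < X \le Y + h_n)$$

$$= \int_{\mathbb{R}} \left( \int_{y}^{y+h_n} f(x)\,dx \right) f(y)\,dy$$

$$= \int_{\mathbb{R}} \big( F(y + h_n) - F(y) \big) f(y)\,dy$$

$$= h_n \int_{\mathbb{R}} \frac{F(y + h_n) - F(y)}{h_n}\,f(y)\,dy$$

$$\sim h_n \int_{\mathbb{R}} f^2(y)\,dy, \tag{70}$$

where in the last step we have used that $(F(y + h_n) - F(y))/h_n \to f(y)$ and the dominated
convergence theorem. Moreover, we set

$$h_{1,n}(x) := E[h_n(x, Y)] - \vartheta_n = E[I_{\{Y < x \le Y + h_n\}}] - \vartheta_n$$

$$= F(x) - F(x - h_n) - \vartheta_n \tag{71}$$

and

$$h_{2,n}(y) := E[h_n(X, y)] - \vartheta_n = E[I_{\{y < X \le y + h_n\}}] - \vartheta_n = F(y + h_n) - F(y) - \vartheta_n$$

and

$$h_{3,n}(x,y) := h_n(x,y) - h_{1,n}(x) - h_{2,n}(y) - \vartheta_n$$

$$= I_{\{y < x \le y + h_n\}} - F(x) + F(x - h_n) + \vartheta_n - F(y + h_n) + F(y).$$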
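The first-order approximation (70) can be illustrated in the standard normal case, where it is available in closed form. The sketch below is our own and rests on one assumption beyond the text: for independent $X, Y \sim \mathcal{N}(0,1)$ the difference $X - Y$ is $\mathcal{N}(0,2)$, so that $\vartheta_n = \Phi(h_n/\sqrt{2}) - \frac{1}{2} = \frac{1}{2}\,\mathrm{erf}(h_n/2)$, while the right-hand side of (70) is $h_n/(2\sqrt{\pi})$. The function names are hypothetical.

```python
import math

def theta_exact(h):
    """P(Y < X <= Y + h) for independent standard normal X, Y (X - Y is N(0, 2))."""
    return 0.5 * math.erf(h / 2)

def theta_first_order(h):
    """Approximation h * int f^2 with f the standard normal density, cf. (70)."""
    return h / (2 * math.sqrt(math.pi))

# Shift heights h_n = c / sqrt(n) for c = 1 and growing sample sizes
for n in (100, 2000, 50000):
    h = 1 / math.sqrt(n)
    print(n, theta_exact(h), theta_first_order(h), theta_exact(h) / theta_first_order(h))
```

The printed ratio tends to 1 as $h_n \to 0$, with a relative error of order $h_n^2$.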
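We will now show that

$$\sup_{0 \le \lambda \le \tau} \frac{1}{n^{3/2}} \left| \sum_{i=1}^{[\lambda n]} \sum_{j=[\tau n]+1}^{n} \big( h_{1,n}(\varepsilon_i) + h_{2,n}(\varepsilon_j) + h_{3,n}(\varepsilon_i, \varepsilon_j) \big) \right| \to 0 \tag{72}$$

in probability, and from this it follows by the sequence of Hoeffding decompositions (69) that

$$\sup_{0 \le \lambda \le \tau} \frac{1}{n^{3/2}} \left| \sum_{i=1}^{[\lambda n]} \sum_{j=[\tau n]+1}^{n} \big( h_n(\varepsilon_i, \varepsilon_j) - \vartheta_n \big) \right| \to 0,$$

i.e. that the second term in (68) converges uniformly to

$$\lim_{n \to \infty} \frac{1}{n^{3/2}} \sum_{i=1}^{[\lambda n]} \sum_{j=[\tau n]+1}^{n} \vartheta_n = \lim_{n \to \infty} \frac{1}{n^{3/2}}\,[\lambda n](n - [\tau n])\,\vartheta_n = \lambda(1 - \tau)\,c \int_{\mathbb{R}} f^2(x)\,dx$$

by (70) and (67).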

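We use the triangle inequality and show the uniform convergence to 0 for each of the three
terms in (72) separately. Since the parameter $\lambda$ occurs only in the floor function value $[\lambda n]$,
the supremum is in fact a maximum, and the $h_{1,n}(\varepsilon_i)$ are i.i.d. random variables, so we can
use Kolmogorov's inequality. We obtain for the first term in (72)

$$P\left( \sup_{0 \le \lambda \le \tau} \frac{n - [\tau n]}{n^{3/2}} \left| \sum_{i=1}^{[\lambda n]} h_{1,n}(\varepsilon_i) \right| > s \right) \le \frac{1}{s^2}\,\frac{(n - [\tau n])^2}{n^3} \sum_{i=1}^{[\tau n]} \mathrm{Var}[h_{1,n}(\varepsilon_i)]. \tag{73}$$

Now consider an independent copy $\varepsilon$ of the $\varepsilon_i$ and the Taylor expansion of $F$ around the
value of $\varepsilon$,

$$F(\varepsilon + h_n) = F(\varepsilon) + F'(\varepsilon)\,h_n + F''(\varepsilon + \delta)\,\frac{h_n^2}{2},$$

where the last term is the Lagrange remainder and thus $\varepsilon + \delta$ is between $\varepsilon$ and $\varepsilon + h_n$. Then
we obtain

$$\frac{1}{h_n^2}\,\mathrm{Var}[h_{1,n}(\varepsilon)] = \mathrm{Var}\left[ \frac{F(\varepsilon) - F(\varepsilon - h_n)}{h_n} \right] = \mathrm{Var}\left[ f(\varepsilon) + F''(\varepsilon + \delta)\,\frac{h_n}{2} \right]$$

$$= \mathrm{Var}[f(\varepsilon)] + \mathrm{Var}\left[ F''(\varepsilon + \delta)\,\frac{h_n}{2} \right] + 2 \left( E\left[ f(\varepsilon)\,F''(\varepsilon + \delta)\,\frac{h_n}{2} \right] - E[f(\varepsilon)]\,E\left[ F''(\varepsilon + \delta)\,\frac{h_n}{2} \right] \right),$$

and since $f = F'$ and $F''$ are bounded by assumption, we get $\mathrm{Var}[h_{1,n}(\varepsilon)] = O(h_n^2)$. Since
$h_n \to 0$, the right-hand side of (73) converges to 0 as $n$ increases.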
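In the same manner, we obtain

$$P\left( \sup_{0 \le \lambda \le \tau} \frac{[\lambda n]}{n^{3/2}} \left| \sum_{j=[\tau n]+1}^{n} h_{2,n}(\varepsilon_j) \right| > s \right) \le \frac{1}{s^2}\,\frac{\tau^2 n^2}{n^3} \sum_{j=[\tau n]+1}^{n} \mathrm{Var}[h_{2,n}(\varepsilon_j)] \to 0. \tag{74}$$

It remains to show that

$$\sup_{0 \le \lambda \le \tau} \frac{1}{n^{3/2}} \left| \sum_{i=1}^{[\lambda n]} \sum_{j=[\tau n]+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right| \to 0 \tag{75}$$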

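in probability. We set temporarily $l := [\lambda n]$ and $T := [\tau n]$ and obtain from Markov's
inequality

$$P\left( \max_{0 \le l \le T} \frac{1}{n^{3/2}} \left| \sum_{i=1}^{l} \sum_{j=T+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right| > s \right) \le \frac{1}{s^2}\,E\left[ \max_{0 \le l \le T} \left( \frac{1}{n^{3/2}} \sum_{i=1}^{l} \sum_{j=T+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right)^2 \right].$$

Now for any collection of random variables $Y_1, \ldots, Y_k$, one has $E[\max\{Y_1^2, \ldots, Y_k^2\}] \le \sum_{i=1}^{k} E[Y_i^2]$,
so that

$$E\left[ \max_{0 \le l \le T} \left( \frac{1}{n^{3/2}} \sum_{i=1}^{l} \sum_{j=T+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right)^2 \right] \le \frac{1}{n^3} \sum_{l=1}^{T} E\left[ \left( \sum_{i=1}^{l} \sum_{j=T+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right)^2 \right]$$

$$= \frac{1}{n^3} \sum_{l=1}^{T} \sum_{i=1}^{l} \sum_{j=T+1}^{n} \mathrm{Var}[h_{3,n}(\varepsilon_i, \varepsilon_j)],$$

where in the last step we have used that the $h_{3,n}(\varepsilon_i, \varepsilon_j)$ are uncorrelated by Hoeffding's decomposition.
Now for two i.i.d. random variables $\varepsilon, \eta$, we have, like above with the Taylor
expansion of $F$:

$$\mathrm{Var}[h_{3,n}(\varepsilon, \eta)] = \mathrm{Var}\big[ I_{\{\eta < \varepsilon \le \eta + h_n\}} - F(\varepsilon) + F(\varepsilon - h_n) + \vartheta_n - F(\eta + h_n) + F(\eta) \big]$$

$$= \mathrm{Var}\big[ I_{\{\eta < \varepsilon \le \eta + h_n\}} - h_n \big( f(\varepsilon) + O_P(h_n) \big) - h_n \big( f(\eta) + O_P(h_n) \big) \big]$$

$$= \mathrm{Var}\big[ I_{\{\eta < \varepsilon \le \eta + h_n\}} \big] + \mathrm{Var}\big[ h_n \big( f(\varepsilon) + f(\eta) + O_P(h_n) \big) \big] - 2\,\mathrm{Cov}\big[ I_{\{\eta < \varepsilon \le \eta + h_n\}},\ h_n \big( f(\varepsilon) + f(\eta) + O_P(h_n) \big) \big]$$

$$\le (\vartheta_n - \vartheta_n^2) + h_n^2\,O(1) + 2 \sqrt{(\vartheta_n - \vartheta_n^2) \cdot h_n^2\,O(1)}$$

$$= O(h_n),$$

using (70). We have just shown that

$$P\left( \max_{0 \le l \le T} \frac{1}{n^{3/2}} \left| \sum_{i=1}^{l} \sum_{j=T+1}^{n} h_{3,n}(\varepsilon_i, \varepsilon_j) \right| > s \right) \le \frac{1}{s^2}\,O(h_n),$$

which proves (75). So we have proven (68) for the case $\lambda \le \tau$. The proof for $\lambda > \tau$ is
similar. □

Now the stage is set to calculate the ARE of the Wilcoxon test based on $U^{(W)}_k$ and the
CUSUM test based on $U^{(C)}_k$, defined in the section about the ARE in the LRD case. Let
$q_\alpha$ denote the upper $\alpha$-quantile of the distribution of $\sup_{0 \le \lambda \le 1} BB(\lambda)$. By Theorem 5.2 the
power of the tests is given respectively by

$$P\Big( \sup_{0 \le \lambda \le 1} \big( BB(\lambda) + c_C\,\phi_\tau(\lambda) \big) \ge q_\alpha \Big) \tag{76}$$

and

$$P\Big( \sup_{0 \le \lambda \le 1} \Big( BB(\lambda) + \frac{c_W}{\sigma \cdot 2\sqrt{\pi}}\,\phi_\tau(\lambda) \Big) \ge q_\alpha \Big), \tag{77}$$

where $\sigma^2 = 1/12$ and we assume that

$$h_W(n) = \frac{c_W}{\sqrt{n}}, \qquad h_C(n) = \frac{c_C}{\sqrt{n}}.$$

Thus if we want both tests to have identical power, we must ensure that $c_C = c_W/(\sigma \cdot 2\sqrt{\pi})$,
in other words

$$\frac{h_C(n)}{h_W(n)} = \frac{c_C}{c_W} = \frac{1}{\sigma \cdot 2\sqrt{\pi}}.$$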
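Now we define, as in the proof for LRD observations, the probability

$$\psi(t) := P\Big( \sup_{0 \le \lambda \le 1} \big( BB(\lambda) + t\,\phi_\tau(\lambda) \big) \ge q_\alpha \Big),$$

for whose generalized inverse $\psi^-$ holds

$$P\Big( \sup_{0 \le \lambda \le 1} \big( BB(\lambda) + \psi^-(\beta)\,\phi_\tau(\lambda) \big) \ge q_\alpha \Big) \ge \beta. \tag{78}$$

Now, comparing (78) and (76), we conclude that we can detect a level shift of size $h$ at time
$[n\tau]$ with the CUSUM test of (asymptotic) level $\alpha$ and power $\beta$ based on $n$ observations, if
$h_C(n) = \frac{c_C}{\sqrt{n}}$ and where $c_C$ satisfies $c_C = \psi^-(\beta)$; hence we obtain that $h_C(n)$ has to satisfy

$$h_C(n) = \frac{1}{\sqrt{n}}\,\psi^-(\beta).$$

In the same manner, we get for the Wilcoxon test the conditions $h_W(n) = \frac{c_W}{\sqrt{n}}$ and $c_W/(\sigma 2\sqrt{\pi}) = \psi^-(\beta)$
and thus

$$h_W(n) = \frac{\sigma 2\sqrt{\pi}}{\sqrt{n}}\,\psi^-(\beta).$$

Solving these two equations for $n$ again and denoting the resulting numbers of observations
by $n_C$ and $n_W$, respectively, we obtain

$$n_C = \left( \frac{\psi^-(\beta)}{h_C} \right)^2, \qquad n_W = \left( \frac{\sigma 2\sqrt{\pi}\,\psi^-(\beta)}{h_W} \right)^2.$$

To obtain $ARE(T_W, T_C)$, we equate $h_W$ and $h_C$. We then obtain the following theorem.

Theorem 5.3. Let $X_1, \ldots, X_n$ be i.i.d. random variables with $X_i \sim \mathcal{N}(0, 1)$. Then

$$ARE(T_W, T_C) = \lim \frac{n_C}{n_W} = (2\sigma\sqrt{\pi})^{-2} = \frac{3}{\pi}, \tag{79}$$

where $T_C$, $T_W$ denote the one-sided CUSUM test, respectively the one-sided Wilcoxon test,
for the test problem $(H, A_{\tau,h_n})$.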
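The value $3/\pi$ in (79) rests on $\sigma^2 = 1/12$ for the Wilcoxon kernel. As a hedged numerical check (our own sketch, with hypothetical helper names), note that for standard normal data $\tilde{h}_W(x) = E[h_W(x, Y)] = \frac{1}{2} - \Phi(x)$, so that $\sigma^2 = E[(\frac{1}{2} - \Phi(X))^2]$ can be integrated numerically and inserted into $(2\sigma\sqrt{\pi})^{-2}$.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

# sigma^2 = E[(1/2 - Phi(X))^2] for X ~ N(0, 1); the exact value is 1/12
sigma2 = simpson(lambda x: (0.5 - Phi(x)) ** 2 * phi(x), -10, 10)

# ARE(T_W, T_C) = (2 * sigma * sqrt(pi))^(-2), cf. (79)
are_iid = (2 * math.sqrt(sigma2) * math.sqrt(math.pi)) ** -2
print(sigma2, are_iid, 3 / math.pi)
```

The computed ARE is about 0.955, so in the i.i.d. Gaussian case the Wilcoxon test needs roughly 5% more observations than the CUSUM test.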



6. Simulation Results

We have proven that for Gaussian data, the CUSUM test and the Wilcoxon change-point
test show asymptotically the same performance, i.e. that their ARE is 1. For Pareto(3,1)
distributed data, we obtain, using (58) and numerical integration, an ARE of approximately
$(2.68)^{2/D}$. Now we will illustrate these findings by a simulation study.

6.1. Gaussian data. We consider realizations $\xi_1, \ldots, \xi_n$ of a fGn process with Hurst parameter
$H = 0.7$ ($D = 0.6$), using the fArma package in R, and create observations

$$X_i = \begin{cases} G(\xi_i) & \text{for } i = 1, \ldots, [n\lambda] \\ G(\xi_i) + h & \text{for } i = [n\lambda] + 1, \ldots, n \end{cases}$$

by applying a transformation $G$ which is (with respect to the standard normal measure)
normalized and square-integrable: $E[G(\xi)] = 0$, $E[G^2(\xi)] = 1$ for $\xi \sim \mathcal{N}(0,1)$. As a first
step, we choose $G(t) = t$ in order to obtain Gaussian observations $X_1, \ldots, X_n$ (later we
will choose a function $G$ such that we obtain Pareto distributed data). In other words, we
consider data which follow the local alternative

$$A_{\tau,h}: \quad \begin{cases} \mu_i = E[X_i] = 0 & \text{for } i = 1, \ldots, [n\tau] \\ \mu_i = E[X_i] = h & \text{for } i = [n\tau] + 1, \ldots, n, \end{cases}$$

as in (8). In contrast to the simulations by Dehling, Rooch and Taqqu (2013), we choose a
sample size $n = 2{,}000$ instead of $n = 500$. We let both the break point $k = [\tau n]$ and the level
shift $h := \mu_{k+1} - \mu_k$ vary; specifically, we choose $k = 100, 200, 600, 1000$ (which corresponds
to $\tau = 0.05, 0.1, 0.3, 0.5$) and we let $h = 0.5, 1, 2$. For each of these situations, we will
compare the power of the CUSUM test and the power of the Wilcoxon change-point test in
the test problem $(H, A_{\tau,h})$. We have repeated each simulation 10,000 times and counted
how often the respective test (correctly) rejected the null hypothesis.
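The data-generating step can be sketched without R. The Python code below is an illustration under stated assumptions and is not the fArma routine used for the simulations: it draws fractional Gaussian noise exactly via the Durbin-Levinson (Hosking) recursion, which costs $O(n^2)$ operations, and then adds a level shift. The function names `fgn_autocov`, `simulate_fgn` and `add_level_shift` are hypothetical.

```python
import math
import random

def fgn_autocov(k, H):
    """Autocovariance of standard fractional Gaussian noise at lag k."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + abs(k - 1) ** (2 * H))

def simulate_fgn(n, H, rng):
    """Exact fGn sample of length n via the Durbin-Levinson recursion."""
    gamma = [fgn_autocov(k, H) for k in range(n)]
    x = [rng.gauss(0.0, 1.0)]
    coef = []        # coefficients of the best linear one-step predictor
    v = gamma[0]     # one-step prediction variance
    for t in range(1, n):
        # Reflection coefficient (partial autocorrelation at lag t)
        k = (gamma[t] - sum(coef[j] * gamma[t - 1 - j] for j in range(t - 1))) / v
        coef = [coef[j] - k * coef[t - 2 - j] for j in range(t - 1)] + [k]
        v *= 1.0 - k * k
        mean = sum(coef[j] * x[t - 1 - j] for j in range(t))
        x.append(mean + math.sqrt(v) * rng.gauss(0.0, 1.0))
    return x

def add_level_shift(noise, k, h):
    """Add a jump of height h to all observations after the first k."""
    return [e + (h if i >= k else 0.0) for i, e in enumerate(noise)]

rng = random.Random(1)
xi = simulate_fgn(200, 0.7, rng)        # H = 0.7, i.e. D = 2 - 2H = 0.6
x = add_level_shift(xi, k=60, h=1.0)    # break at tau = 0.3, G(t) = t
```

For the sample size $n = 2{,}000$ and 10,000 repetitions used here, a faster generator (for instance one based on circulant embedding) would be preferable; the recursion above is chosen only because it is short and exact.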

Since our theoretical considerations yield an ARE of 1, we expect both tests to detect
jumps equally well; that is, both tests, set at the same level, should detect jumps of the
same height and at the same position in the same number of observations with the same
relative frequency. And indeed, we can see in Table 1 that the power of both tests
approximately coincides at many points; differences can be spotted only when the break
occurs early in the data.





             CUSUM test                               Wilcoxon change-point test
         relative jump position τ                 relative jump position τ
         0.05    0.1     0.3     0.5              0.05    0.1     0.3     0.5
h=0.5    0.074   0.153   0.767   0.874    h=0.5   0.072   0.143   0.765   0.876
h=1      0.153   0.694   1.000   1.000    h=1     0.128   0.602   1.000   1.000
h=2      0.828   1.000   1.000   1.000    h=2     0.321   1.000   1.000   1.000

Table 1. Power of the CUSUM test (left) and of the Wilcoxon change-point
test (right), for n = 2000 observations of fGn with LRD parameter H =
0.7, different break points [τn] and different level shifts h. Both tests have
asymptotically level 5%. The calculations are based on 10,000 simulation runs.



6.2. Heavy tailed data. We consider again realizations $\xi_1, \ldots, \xi_n$ of a fGn process with
Hurst parameter $H = 0.7$ ($D = 0.6$) and create observations

$$X_i = \begin{cases} G(\xi_i) & \text{for } i = 1, \ldots, [n\lambda] \\ G(\xi_i) + h & \text{for } i = [n\lambda] + 1, \ldots, n \end{cases}$$

by applying the transformation

$$G(t) = \frac{1}{\sqrt{3/4}} \left( (\Phi(t))^{-1/3} - \frac{3}{2} \right).$$

In this case, the first Hermite coefficient of $G$, obtained by numerical integration, equals
$a_1 \approx -0.6784$. This transformation $G$ produces observations $X_i = G(\xi_i)$ which follow
a standardized Pareto(3,1) distribution with mean zero and variance 1. The probability
density function of $X_i$ is given by

$$f(x) = \begin{cases} 3\sqrt{3/4}\,\left( \sqrt{3/4}\,x + \frac{3}{2} \right)^{-4} & \text{if } x \ge -\sqrt{1/3} \\ 0 & \text{else.} \end{cases}$$


To the second sample of observations, $X_{[n\lambda]+1}, \ldots, X_n$, we again add a constant $h$, but this
time we choose

$$h = h_n = c\,\frac{d_n}{n} = c\,n^{-D/2} \tag{80}$$

as in the local alternatives of Sections 2 and 3. We let the break point $k = [\tau n]$ vary; here, we choose $\tau = 0.05, 0.1, 0.3, 0.5$. We
let also the sample size vary; we will give details below. To these data, we have applied
the CUSUM test and the Wilcoxon change-point test, and under 10,000 simulation runs we
counted how often the respective test (correctly) rejected the null hypothesis.
Now our theoretical considerations, see (58), predict for this situation

$$ARE = \lim \frac{n_C}{n_W} = \left( \frac{|a_1| \int_{\mathbb{R}} f^2(x)\,dx}{\left|\int_{\mathbb{R}} J_1(x)\,dF(x)\right|} \right)^{2/D} \approx (2.67754)^{2/0.6} \approx 26.655.$$



This means that the CUSUM test needs approximately 26.66 times as many observations
as the Wilcoxon test in order to detect the same jump on the same level with the same
probability. In order to find this behaviour, we have analysed the power of the Wilcoxon
test for $n_W = 10, 50, 100, 200$ observations and the power of the CUSUM test for $n_C =
266, 1332, 2666, 5330$ observations.
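The ingredients of this ARE can be recomputed with a short script. The Python sketch below is ours and rests on two derivations that are not spelled out in the text and should be read as assumptions: (i) for the density $f$ of the standardized Pareto(3,1) distribution, the substitution $p = \sqrt{3/4}\,x + 3/2$ gives $\int f^2(x)\,dx = 9\sqrt{3/4}/7$; (ii) since $G$ is monotonically decreasing, $J_1(G(t)) = \varphi(t)$ and hence $|\int J_1\,dF| = \int \varphi^2 = 1/(2\sqrt{\pi})$. Only $a_1 = E[\xi\,G(\xi)]$ is integrated numerically.

```python
import math

S = math.sqrt(3 / 4)   # standard deviation of the Pareto(3,1) distribution
D = 0.6                # LRD parameter, D = 2 - 2H with H = 0.7

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-t / math.sqrt(2))

def G(t):
    """Transformation producing standardized Pareto(3,1) observations."""
    return (Phi(t) ** (-1 / 3) - 1.5) / S

# First Hermite coefficient a_1 = E[xi * G(xi)], by numerical integration
a1 = simpson(lambda t: t * G(t) * phi(t), -12, 12, 24000)

int_f2 = 9 * S / 7                          # assumption (i)
int_J1_dF = 1 / (2 * math.sqrt(math.pi))    # assumption (ii), absolute value

ratio = abs(a1) * int_f2 / int_J1_dF
are_pareto = ratio ** (2 / D)
print(a1, ratio, are_pareto)
```

Under these assumptions the script reproduces $a_1 \approx -0.6784$, a ratio of about 2.68 and an ARE of about 26.7, consistent with the values quoted above up to rounding.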

In order to be able to compare the two tests, we need to have identical level shifts when
applying the Wilcoxon test to a sample of size $n_W$ and the CUSUM test to a sample of
size $n_C = 26.655\,n_W$. This can be achieved by choosing the constants $c$ in (80) accordingly,
namely taking $c_C = 2.67754\,c_W$. In this way, we obtain

$$h_{n_C} = c_C\,n_C^{-D/2} = 2.67754\,c_W\,(26.655\,n_W)^{-D/2} = c_W\,n_W^{-D/2} = h_{n_W}.$$

We ran simulations for two different choices of $c_W$, namely $c_W = 1$ and $c_W = 2$; see Table 3
and Table 4 for the results.
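The matching of the level shifts is simple arithmetic and can be replayed as follows. This is a minimal Python sketch of our own; the function name `jump_height` is hypothetical, and the constants 2.67754 and $D = 0.6$ are taken from the text.

```python
D = 0.6
RATIO = 2.67754            # c_C / c_W for Pareto(3,1) data
ARE = RATIO ** (2 / D)     # approximately 26.655

def jump_height(c, n):
    """Level shift h_n = c * n^(-D/2) as in (80)."""
    return c * n ** (-D / 2)

for c_w in (1, 2):
    for n_w in (10, 50, 100, 200):
        n_c = ARE * n_w                        # matching CUSUM sample size
        h_w = jump_height(c_w, n_w)            # shift seen by the Wilcoxon test
        h_c = jump_height(RATIO * c_w, n_c)    # shift seen by the CUSUM test
        print(c_w, n_w, round(n_c), round(h_w, 2), round(h_c, 2))
```

The two shift columns agree, and the rounded heights are the values of $h$ that appear in Table 3 and Table 4.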

Here, we have to face a problem which was already encountered by Dehling, Rooch and
Taqqu (2013). For the heavy-tailed Pareto data, the convergence of the CUSUM test statistic
towards its limit is so slow that the asymptotic quantiles of the limit distribution are not
appropriate as critical values to define the domain of rejection of the test: in finite sample
situations, the observed level of the test is not 5%, as it should be when using the 5%-quantile
of the asymptotic limit distribution. In order to remedy this, Dehling, Rooch and Taqqu (2013)
used as critical value for the test the finite sample 5%-quantiles of the distribution of the CUSUM test
statistic under the null hypothesis, obtained by a Monte Carlo simulation; see Table 6 in Dehling,
Rooch and Taqqu (2013). Here, we have performed the same steps, but for sample sizes
$n = n_C = 266, 1332, 2666, 5330$. The results are given in Table 2. Note that this problem does
not arise when using the Wilcoxon change-point test, since the Wilcoxon test is distribution
free under the null hypothesis.

The simulation results are shown in Table 3 (for $c_W = 1$) and Table 4 (for $c_W = 2$). Indeed,
for a fixed jump position $\tau$, the power of the CUSUM test (for $n = n_C = 266, 1332, 2666, 5330$
observations) and of the Wilcoxon test (for $n = n_W = 10, 50, 100, 200$ observations) approximately coincide.
They are not fully equal, but we conjecture this is due to the small sample size, which conflicts
with the asymptotic character of our results. But it becomes clear: the CUSUM test needs
considerably more observations to detect the same jump on the same level with the
same probability; as predicted by our calculation, around 27 times as many.

n              266     1332    2666    5330    ∞
q_emp,0.05     0.73    0.66    0.64    0.63    0.59

Table 2. 5%-quantiles of the finite sample distribution of the CUSUM test
under the null hypothesis for Pareto(3,1)-transformed fGn with LRD parameter
H = 0.7 and different sample sizes n = n_C.







                  CUSUM test                                     Wilcoxon change-point test
                relative jump position τ                       relative jump position τ
n       h       0.05    0.1     0.3     0.5      n      h      0.05    0.1     0.3     0.5
266     0.50    0.049   0.049   0.066   0.088    10     0.50   0.036   0.025   0.033   0.079
1332    0.31    0.050   0.052   0.083   0.110    50     0.31   0.049   0.051   0.093   0.120
2666    0.25    0.052   0.055   0.092   0.127    100    0.25   0.050   0.053   0.102   0.134
5330    0.20    0.051   0.054   0.099   0.130    200    0.20   0.051   0.055   0.103   0.134

Table 3. Power of the CUSUM test (left) and of the Wilcoxon change-point
test (right), at different break points [τn], different sample sizes n, and different
jump heights h, for Pareto(3,1) distributed data. Both tests have asymptotically
level 5% (CUSUM test is performed with empirical quantiles). The
calculations are based on 10,000 simulation runs.







                  CUSUM test                                     Wilcoxon change-point test
                relative jump position τ                       relative jump position τ
n       h       0.05    0.1     0.3     0.5      n      h      0.05    0.1     0.3     0.5
266     1.00    0.049   0.054   0.162   0.259    10     1.00   0.033   0.024   0.039   0.197
1332    0.62    0.052   0.062   0.236   0.345    50     0.62   0.049   0.055   0.199   0.283
2666    0.50    0.055   0.069   0.272   0.390    100    0.50   0.051   0.063   0.225   0.316
5330    0.41    0.054   0.074   0.287   0.402    200    0.41   0.054   0.066   0.242   0.338

Table 4. Power of the CUSUM test (left) and of the Wilcoxon change-point
test (right) at different break points [τn], different sample sizes n, and different
jump heights h, for Pareto(3,1) distributed data. Both tests have asymptotically
level 5% (CUSUM test is performed with empirical quantiles). The
calculations are based on 10,000 simulation runs.